 

Archive for the ‘PHP’ Category

Passed, named or query string params?

04 May

This is a question often asked. It’s also a question I had to find my own answers to – and to redefine those answers over time.

So here are my five cents regarding this topic:

Persistence: Passed params

You use passed params where there is a definite order in your url regarding those params, so the second one never goes without the first one etc. One example is a blog with dated posts, where you could use just the year, year/month or even year/month/day to select posts:

/**
 * url: /posts/index/year/month/day/
 */
public function index($year = null, $month = null, $day = null) {}

Other examples: /controller/action/country/city/ or /controller/action/page/paragraph/

So basically, use them when the order is fixed and you always need all previous params to use a specific param.

Note: you can also access them using $this->request->params['pass'][0...x]. But since you always know the order, it’s usually easier to directly pass them into your method as variables.
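To illustrate how such optional but ordered params narrow down the selection, here is a minimal sketch in plain PHP (outside of CakePHP). The function name postDateRange() and the 'from'/'to' keys are made up for this example; in a real action you would turn the range into your find conditions.

```php
<?php
/**
 * Hypothetical helper: turn year[/month[/day]] into a from/to date range.
 * Each param only makes sense if all previous ones are given.
 */
function postDateRange($year = null, $month = null, $day = null) {
    if ($year === null) {
        return null; // no restriction at all
    }
    if ($month === null) {
        return array(
            'from' => sprintf('%04d-01-01', $year),
            'to' => sprintf('%04d-12-31', $year),
        );
    }
    if ($day === null) {
        // date('t') gives the number of days in the given month
        $lastDay = (int)date('t', mktime(0, 0, 0, (int)$month, 1, (int)$year));
        return array(
            'from' => sprintf('%04d-%02d-01', $year, $month),
            'to' => sprintf('%04d-%02d-%02d', $year, $month, $lastDay),
        );
    }
    $date = sprintf('%04d-%02d-%02d', $year, $month, $day);
    return array('from' => $date, 'to' => $date);
}

// /posts/index/2013/ => the whole year
print_r(postDateRange('2013'));
// /posts/index/2013/02/ => February only
print_r(postDateRange('2013', '02'));
```

Note how the day can never be used without the month, and the month never without the year. That is exactly the situation where passed params fit.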

Outdated flexibility: Named params

For me they are still in use in many places – mainly in older projects of mine.

Advantages

They work well with named routing and custom routes. You can easily make named params into passed ones:

// in the routes.php
Router::connect('/specials/offer/:slug', array('controller'=>'products', 'action'=>'view'), array('pass'=>array('slug')));
 
// url array
array('controller' => 'products', 'action' => 'view', 'slug' => 'kitchen-knife')
 
// resulting "pretty" url instead of "/products/view/slug:kitchen-knife/"
/specials/offer/kitchen-knife/

Issues

The encoding of named params can break urls. And, basically, they always violate the HTTP spec. They are cake specific (no other application uses volatile params this way) and no real standard.

Working with extensions can be quite the mess:

// url array
array('ext' => 'json', 'foo' => 'bar')
 
// resulting url
/controller/action/foo:bar.json

To sum it up: Migrate them to query strings (see the next paragraph).
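If old named-param urls are still floating around (bookmarks, external links), one way to deal with them is to rewrite them into their query string equivalent and redirect. The following is only a rough sketch in plain PHP: namedToQueryString() is a hypothetical helper, not part of CakePHP, and it assumes simple key:value segments (no array-style named params).

```php
<?php
/**
 * Hypothetical helper: rewrite "/controller/action/foo:bar/page:2/"
 * into "/controller/action?foo=bar&page=2".
 * Segments without a colon are treated as passed params and stay in the path.
 */
function namedToQueryString($path) {
    $segments = array();
    $query = array();
    foreach (explode('/', trim($path, '/')) as $segment) {
        if ($segment === '') {
            continue;
        }
        $pos = strpos($segment, ':');
        if ($pos === false || $pos === 0) {
            $segments[] = $segment; // controller, action or passed param
            continue;
        }
        $key = urldecode(substr($segment, 0, $pos));
        $query[$key] = urldecode(substr($segment, $pos + 1));
    }
    $url = '/' . implode('/', $segments);
    if ($query) {
        // http_build_query() takes care of the proper encoding
        $url .= '?' . http_build_query($query, '', '&');
    }
    return $url;
}

echo namedToQueryString('/products/index/sort:name/direction:asc/page:2/') . "\n";
// /products/index?sort=name&direction=asc&page=2
```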

Flexibility the right way: Query strings

This is the way to go for 2.x and especially for 3.x.

When to use them

Use them if the order is irrelevant or if they can be combined in different ways. In many cases, like pagination, not all query strings are needed to get the desired result. So they are usually optional.

Advantages

They do not have the encoding issues of named params and are pretty much the way all other web apps also use such volatile params. So it is the de facto standard for web apps.

They also work well with extensions:

// url array
array('ext' => 'json', '?' => array('foo' => 'bar'))
 
// resulting url
/controller/action.json?foo=bar

Tip: Since CakePHP 2.3 you can use the convenience method CakeRequest::query() to read the url query array in an error free manner. Any keys that do not exist will return null:

$foo = $this->request->query('foo');
// returns "bar" in our example - or null if no foo key is found

Last words

As mentioned above, my opinion regarding named params changed over time. I was working with them for years, from 1.2 up to 2.2. But at some point I saw the disadvantages outweigh the advantages. In 2.x there have also been many improvements regarding query strings, so since 2.3 I always use and recommend query strings for new applications. And for legacy ones I recommend upgrading – if possible.

Insight

The 3.x docs state the following:

Named parameters required special handling both in CakePHP as well as any PHP or javascript library that needed to interact with them, as named parameters are not implemented or understood by any library except CakePHP. The additional complexity and code required to support named parameters did not justify their existance, and they have been removed. In their place you should use standard query string parameters or passed arguments

Read more about the usage of the param types above in the documentation.

 

Posted in CakePHP

 

Interesting (CakePHP/PHP) links – 2013

16 Apr

PHP5.5

Try/Catch/Finally is a nice article about upcoming 5.5 features – and how to use them wisely.

Enhance your select form fields

Chosen is a beautiful addon for your select dropdowns, especially if you use them to filter or search on a large set of data and if you do not yet use any AJAX here. Just add some markup and JS and you get a quick and easy way to handle them.

There are also some CakePHP Plugins for the Chosen script available already.

SQL to Cake find statements

You can use this converter to form your SQL snippets into Cake find() calls. This is especially handy for beginners, but can also be useful to automate some migration scripts etc.

 

Continuous integration with Travis and CakePHP

03 Apr

I must admit that I only recently started to take testing more seriously. In the past I only wrote tests if there was time to spare.

But the larger projects get, the more likely it is that even minor changes break other pieces in them. Most of the time you don’t even realize it until it is too late, and people report it broken after deployment. To avoid this, it is wise to have good test coverage of your project. Also, it helps to have some software that tests your projects continuously and automatically. This is where Jenkins or Travis come into play.

In the following tutorial I want to focus on how to use the free Travis CI to automatically test your github repositories after every commit. Jenkins you would have to set up manually on your server. Travis is already available and only needs some minor configuration.

Note: For Jenkins there is some documentation in the cookbook.

Travis Setup

You need to get a Travis account first. The easiest way would be to sign in with your github account as this automatically sets everything up. Then you just need to enable the github repositories you want to be tested.

Github Setup

On github you also need to enable the corresponding “Travis hook” for your repositories. It will also need the token you can get from the travis settings page.

Then we need to make our “.travis.yml” configuration file. The documentation is pretty good, but I will still outline the major pitfalls. A basic version to test your “YourPluginRepository” plugin code for PHP5.3 and PHP5.4 would be something like this:

language: php
 
php:
  - 5.3
  - 5.4
 
before_script:
  - git clone --depth 1 git://github.com/cakephp/cakephp ../cakephp && cd ../cakephp
  - mv ../YourPluginRepository plugins/YourPlugin
  - sh -c "mysql -e 'CREATE DATABASE cakephp_test;'"
  - chmod -R 777 ../cakephp/app/tmp
  - echo "<?php
    class DATABASE_CONFIG {
    public \$test = array(
      'datasource' => 'Database/Mysql',
      'database' => 'cakephp_test',
      'host' => '0.0.0.0',
      'login' => 'travis',
      'persistent' => false,
    );
    }" > ../cakephp/app/Config/database.php
 
script:
  - ./lib/Cake/Console/cake test YourPlugin AllTests --stderr
 
notifications:
  email: false

Save this file to your repository root. After the commit and push Travis should be notified via hook and your first test build should already be in the queue there. You can find it at

https://travis-ci.org/[YOUR_GITHUB_USERNAME]/[YOUR_PROJECT_NAME]/

If you want to take a look at some more complex travis setups, take a look at my tools plugin or the cakephp repository. Both also use different CakePHP versions or database types.

Hooks

If you take a look at the Build-Lifecycle you can see that you have the possibility to install additional software/packages and configure your build to your liking prior to executing the tests.

Status image for your readme or website:

https://travis-ci.org/[YOUR_GITHUB_USERNAME]/[YOUR_PROJECT_NAME].png

Current status for my tools plugin

That was actually the main reason I started to try out Travis. The plugin got larger and more and more apps started to use it. I had to start thinking about how to make it more stable and reliable, especially after modifying parts of it. Regression tests really help here a lot!

Just as an example image: Build Status

I noticed, however, that this image does not change very often. So if your build does not pass anymore, don’t rely on this image. Better use the notification hooks to get alerted :)

Note

It seems I wasn’t the only one writing something about this. Independently, Mark Story wrote an alternative article about the same topic.

 
 

CakePHP and Tree structures

17 Feb

Some of you might already have worked with the TreeBehavior to generate nested categories or something like that. In most cases we just want to have two or three levels and some hierarchical structure using parent_id. But trees can do way more than that. At least if you also use lft and rght (which the behavior uses internally to order the tree) and MPTT (Modified Preorder Tree Traversal).

Unfortunately, a helper for tree-structured output is not a part of the cake core. Luckily some skilled developer(s) created a very nicely working version for 1.x which I upgraded and enhanced for 2.x. It works flawlessly with any model that uses the TreeBehavior.

What for?

Navigation, Category Tree in Shop Systems, Threaded Boards with Posts or Comments, … The list where you can use models that behave like trees is endless. In my case we needed a complex category tree including “active path” feature and breadcrumbs. Also some additional magic to keep the tree in a visible “length” (to only show the relevant branches and only to a specific level).

Setup

The behavior can just be attached to the model itself using $actsAs – as always. For the helper we include it in our controller as $helpers = array('Tools.Tree');.

Your table should have “parent_id”, “lft”, “rght” fields. If you use UUIDs, make sure that “parent_id” is UUID (char36) just as your primary key. “lft”/“rght” must be integers, though. Otherwise your tree will always be invalid, as those fields have nothing to do with the ids themselves, but with the order inside the tree.

From the documentation: “The parent field must be able to have a NULL value! It might seem to work, if you just give the top elements a parent value of zero, but reordering the tree (and possible other operations) will fail”.

Usage

TreeBehavior

I don’t want to go into the details regarding the core Tree behavior, as it is already very well explained in the documentation.

Just remember: Do not touch the lft/rght fields. They should not be in your forms or be modified by you in any way. The behavior internally sets the right values here. You just need to tell it what parent_id you want to put the record under.

If you already saved some records in your tree there are usually three ways of getting the data in a way that you can output it properly:

  1. find(all) in combination with 'order' => array('lft' => 'ASC') and maybe scope/conditions
  2. find(threaded) and 'conditions' => array('id' => $id, ...) and 'order' => array('lft' => 'ASC')
  3. children($id) if you have multiple trees in your table, for example – or if you want to retrieve only a part of the tree

The last two methods will already nest your data using parent_id/children as key. The first one you can nest yourself if needed using Hash::nest() as shown below.

Note that you must set the order yourself for both find() calls. Only children() will automatically use the correct order.
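To make the difference between a flat and a nested result more concrete, here is a rough plain-PHP sketch of what nesting a flat, lft-ordered find(all) result by parent_id amounts to. It is not the Hash::nest() implementation; the function name nestCategories() and the hard-coded 'Category' alias are assumptions for illustration only.

```php
<?php
/**
 * Hypothetical helper: nest flat rows (already ordered by lft)
 * into a parent/children structure using parent_id.
 * Simple recursive version, fine for small trees.
 */
function nestCategories(array $rows, $parentId = null) {
    $branch = array();
    foreach ($rows as $row) {
        if ($row['Category']['parent_id'] === $parentId) {
            // collect everything below this node the same way
            $row['children'] = nestCategories($rows, $row['Category']['id']);
            $branch[] = $row;
        }
    }
    return $branch;
}

$rows = array(
    array('Category' => array('id' => 1, 'parent_id' => null, 'name' => 'Food')),
    array('Category' => array('id' => 2, 'parent_id' => 1, 'name' => 'Fruits')),
    array('Category' => array('id' => 3, 'parent_id' => 2, 'name' => 'Apples')),
    array('Category' => array('id' => 4, 'parent_id' => 1, 'name' => 'Vegetables')),
);
$tree = nestCategories($rows);
print_r($tree);
```

Since siblings keep the order of the input rows, the lft ordering from your find() call carries over into the nested array.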

Short reference of useful behavior methods:

  • getPath: return current path to this id
  • children: get all children to an id
  • removeFromTree (with true/false to remove children or moving them up)
  • moveUp
  • moveDown
  • verify: check that the tree is valid
  • recover: if not valid, try to repair the tree (with mode return/delete)

You can put two up/down icons in your index table or threaded tree list pointing to two actions up/down which then invoke the behavior’s methods. This way you can easily sort your tree using those methods from the backend. You can also use some more sophisticated ajax dynamic tree reordering using jquery plugins etc. Then you would probably use reorder() as this method can reorder multiple items at once.

The current path is needed to build a breadcrumbs list. See the chapter for breadcrumbs below for details.

Another useful method (even though it shouldn’t be in the behavior but the view scope) is generateTreeList(). We can use it to populate our select boxes. A baked edit/add form for your categories should look like this:

...
echo $this->Form->input('parent_id');
...

Just pass down $parents in your action:

$spacer = '--';
$parents = $this->Category->generateTreeList($conditions, $keyPath, $valuePath, $spacer);
$this->set(compact('parents'));

TreeHelper

Way more interesting is how we can actually output the tree in a way that allows us to style it – especially the currently active path. Also how to fully customize each tree level.

The most simple use case – using the key/value (id and name usually) of the threaded array:

// in your view/element ctp 
echo $this->Tree->generate($categories, array('id' => 'my-tree'));

It will simply output a nested ul/li tree with the displayField (name) text.

We could also use helper callbacks (here a custom MyTreeOutputHelper::format() method) to adjust the nodes:

echo $this->Tree->generate($categories, array('id' => 'my-tree', 'callback' => array($this->MyTreeOutput, 'format')));

This method then can look like:

public function format($data) {
    if (empty($data['data']['Category']['visible'])) {
        return; // do not display
    }
    // append (active) for active path elements
    return $data['data']['Category']['name'] . ($data['activePathElement'] ? ' (active)' : '');
}

A more verbose example of the tree helper capabilities using elements:

$categories = Hash::nest($categories); // optional, if you used find(all) instead of find(threaded) or children()
 
$treeOptions = array('id' => 'main-navigation', 'model' => 'Category', 'element' => 'node', 'autoPath' => array($currentCategory['Category']['lft'], $currentCategory['Category']['rght']));
 
echo $this->Tree->generate($categories, $treeOptions);

And the /Elements/node.ctp, for example:

$category = $data['Category'];
if (!$category['active']) { // You can do anything here depending on the record content
    return;
}
echo $this->Html->link($category['name'], array('action' => 'find', 'category_id' => $category['id']));

Using autoPath we can make the tree leverage the lft/rght MPTT and automatically mark the current path as active. Styling it via css is then a piece of cake.
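Why lft/rght are enough to find the active path: in MPTT a node is an ancestor of the current node exactly when its lft/rght interval encloses the current node's interval. The sketch below shows this general rule in plain PHP; isOnActivePath() and the sample data are made up, and it is not necessarily how the helper implements it internally.

```php
<?php
/**
 * General MPTT rule: $node is the current node or one of its ancestors
 * if its lft/rght interval encloses the current node's interval.
 */
function isOnActivePath(array $node, array $current) {
    return $node['lft'] <= $current['lft'] && $node['rght'] >= $current['rght'];
}

// Sample tree: Food (1,10) > Fruits (2,7) > Apples (3,4), Pears (5,6); Food > Vegetables (8,9)
$nodes = array(
    array('name' => 'Food', 'lft' => 1, 'rght' => 10),
    array('name' => 'Fruits', 'lft' => 2, 'rght' => 7),
    array('name' => 'Apples', 'lft' => 3, 'rght' => 4),
    array('name' => 'Pears', 'lft' => 5, 'rght' => 6),
    array('name' => 'Vegetables', 'lft' => 8, 'rght' => 9),
);
$current = $nodes[3]; // Pears

foreach ($nodes as $node) {
    echo $node['name'] . (isOnActivePath($node, $current) ? ' (active)' : '') . "\n";
}
```

Here Food, Fruits and Pears are marked as active, while Apples and Vegetables are not.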

To enhance it further, you can use frontend js via jquery plugins (accordion or multi-level menu) or the quite powerful superfish script. If you want to divide your tree in a main top and a sub side navigation you can achieve that using the maxDepth option and only return and output specific levels of the tree per menu.

Tip: Take a look at the test cases for the helper for further details on the above options and its usage as well as the expected output for those.

Short reference for the more important settings:

  • ‘model’ => name of the model (key) to look for in the data array. defaults to the first model for the current controller. If set to false 2d arrays will be allowed/expected.
  • ‘alias’ => the array key to output for a simple ul (not used if element or callback is specified)
  • ‘type’ => type of output defaults to ul
  • ‘itemType’ => type of item output default to li
  • ‘id’ => id for top level ‘type’
  • ‘class’ => class for top level ‘type’
  • ‘element’ => path to an element to render to get node contents.
  • ‘callback’ => callback to use to get node contents. e.g. array(&$anObject, ‘methodName’) or ‘floatingMethod’
  • ‘autoPath’ => array($left, $right[, $classToAdd = 'active']) if set any item in the path will have the class $classToAdd added. MPTT only.
  • ‘maxDepth’ => used to control the depth up to which to generate tree
  • ‘splitCount’ => the number of “parallel” types. defaults to null (disabled) set the splitCount, and optionally set the splitDepth to get parallel lists

And internally (on top of the above settings) in callbacks and elements the following information passed in as array (callback) or variables (element) is available:

  • ‘data’ => the data array itself
  • ‘depth’
  • ‘hasChildren’
  • ‘numberOfDirectChildren’
  • ‘numberOfTotalChildren’
  • ‘firstChild’
  • ‘lastChild’
  • ‘hasVisibleChildren’
  • ‘activePathElement’

Performance

Don’t forget to add some indexes on your tables to speed up the “reading” process. This is most likely the bottleneck of larger trees. Therefore you should add indexes for parent_id, lft and rght:

ALTER TABLE  `categories` ADD INDEX  `lft` (  `lft` );
ALTER TABLE  `categories` ADD INDEX  `rght` (  `rght` );
ALTER TABLE  `categories` ADD INDEX  `parent_id` (  `parent_id` );

Breadcrumbs

We can use the behavior’s method for this as mentioned above:

// controller
$treePath = $this->Model->getPath($currentCategoryId);
$this->set(compact('treePath'));

Now we just have to display the list and style it:

$total = count($treePath);
echo '<ul id="category-breadcrumbs" class="breadcrumbs">';
echo '<li>';
echo $this->Html->link('ALL', array('action'=>'find', 'category_id' => ''));
echo '</li>';
foreach ($treePath as $key => $treeCategory) {
    if (!$treeCategory['Category']['active']) {
        continue;
    }
    echo '<li>';
    if ($total === $key + 1) {
        echo h($treeCategory['Category']['name']);
    } else {
        echo $this->Html->link($treeCategory['Category']['name'], array('action'=>'find', 'category_id' => $treeCategory['Category']['id']));
    }
    echo '</li>';
}
echo '</ul>';

You will notice that the last element will not be a link anymore but a normal <li> tag. This way we can style it as the current (active) node in a different way to the other path elements.

Tip: You could also use the existing helper method addCrumb() as well as getCrumbList() of the HtmlHelper and output your breadcrumbs this way. If you don’t need any special treatment of your nodes, that is.

More experimental stuff

For very large trees like in category navigation structures with >> 100 category nodes it probably makes sense to only display the current level, and all direct siblings in the “active path”. It can also be a factor for search engines to only link the “relevant” cross-links here. I experimented with the hideUnrelated option and a custom callback or element to manually hide the elements marked as 'hide' => true.

Tip: It does need a nested structure (so make sure you use the right methods from above) and the treePath passed in as option. See the test case for details.

The following keys are then also available in callbacks/elements:

  • show => if it has to be shown as part of the active path
  • parent_show => if it is a child of an active path element and should also be visible
  • hide => if it should be hidden (overrides the other two settings)

Yet undecided

I have been thinking about removing the find(all) support in favor of supporting always nested array input. This would probably make the code way shorter and easier to read and maintain. As outlined above using Hash::nest() you can always form your array this way prior to passing it into the helper. So the overhead here could be removed.

Feel free to submit any ideas, criticism or PRs (pull requests). I just recently started to seriously work with tree structured data.

 

 

CakePHP Tips

22 Jan

All new CakePHP tips collected over the last few weeks.

Dispatcher execution order

Tested on Cake2.3RC:

  • webroot/index.php
  • Config/core.php
  • Config/bootstrap.php
  • dispatchers defined in core/bootstrap
  • Config/routes.php
  • Config/database.php
  • controller and all the rest

It is important to know that the dispatchers will be triggered before your routes are even loaded. If you enabled a dispatcher like the CacheDispatcher, the last three elements might not even be triggered anymore (if the cached view file can be found) and the content might directly get sent at this point.

Resolve dispatcher conflicts

So if you implemented some routing for subdomains or other domains mapped to the same application, you need to make sure that the CacheDispatcher, for example, does not create a conflict. You can use the new Cache.viewPrefix (cake2.3) in your bootstrap to avoid serving the wrong cached file.

Callback execution order

Sometimes the execution order can be pretty important. At this point it is good to know which callback is triggered at what point – and whether the order is what you need/expect.

Note: Don’t trust what somebody tells you. Do it yourself. So instead of just believing my post here, execute the code yourself, if possible. You can write yourself some neat little callback execution tests for times when you need them.

Behavior/Model callbacks

// on save
Array
(
    [0] => TestBehavior::beforeValidate
    [1] => TestComment::beforeValidate
    // validate
    [2] => TestBehavior::afterValidate
    [3] => TestComment::afterValidate
    [4] => TestBehavior::beforeSave
    [5] => TestComment::beforeSave
    // save
    [6] => TestBehavior::afterSave
    [7] => TestComment::afterSave
)
 
// on find
Array
(
    [0] => TestBehavior::beforeFind
    [1] => TestComment::beforeFind
    // find
    [2] => TestBehavior::afterFind
    [3] => TestComment::afterFind
)
 
// on delete
Array
(
    [0] => TestBehavior::beforeDelete
    [1] => TestComment::beforeDelete
    // delete
    [2] => TestBehavior::afterDelete
    [3] => TestComment::afterDelete
)

Controller/Component callbacks

Array
(
    [0] => TestComponent::initialize
    [1] => TestController::beforeFilter
    [2] => TestComponent::startup
    [3] => TestController::test // action itself
    [4] => TestController::beforeRender
    // rendering view
    [5] => TestComponent::shutdown
    [6] => TestController::afterFilter
    // output
)

This makes sense. The components first initialize themselves and might modify the controller’s “init” state prior to its dispatching of beforeFilter. Then the components do their work before and after the action itself and at the end afterFilter is invoked prior to outputting the result.

When a redirect actually happens, the remaining callbacks are no longer executed, though:

// in case of a redirect:
    [0] => TestController::redirect();
    [1] => TestComponent::beforeRedirect();
// redirect if not returned false, otherwise continue

Note: The redirect will only happen if the component’s callback does not return false here.

Helper callbacks

This is a little bit more tricky. The templating in cake uses a two-pass-rendering. First the view will be rendered and then it will be “injected” into the layout. So you cannot just echo debug output, you need to log the execution order if you want correct results here.

Using my test case for it, we get:

Array
(
    [0] => TestHelper::beforeRender
    [1] => TestHelper::beforeRenderFile
    // render view
    [2] => TestHelper::afterRenderFile
    [3] => TestHelper::afterRender
    [4] => TestHelper::beforeLayout
    [5] => TestHelper::beforeRenderFile
    // render layout and insert rendered view into "content" block
    [6] => TestHelper::afterRenderFile
    [7] => TestHelper::afterLayout
)

Interestingly, beforeRenderFile and afterRenderFile are invoked for each file. If you include elements, they will also be invoked for them. They are quite handy if you want to directly modify the rendered result in some way. This can also be a decision maker on whether to make some “markup snippet” an element or a helper method. Elements can also be cached.

Note: beforeRender and afterRender can also be invoked for each element. But you would need to manually enable those (default is false).

Test case callbacks

Please see this post for reference.

“Indirect modification” for pagination

If you still happen to use the 1.3 syntax in your 2.x apps, you might have run into something like this:

Notice (8):
Indirect modification of overloaded property PostsController::$paginate has no effect [APP/Controller/PostsController.php, line 13]

I wanted to fix this in the cake core controller, but it seems, it might be wiser to upgrade to the new PaginatorComponent syntax. If that is not possible, you can easily avoid this notice by adding this to your AppController:

/**
 * The paginate options for this controller
 *
 * @var array
 */
public $paginate = array();

This way the array is defined and accessing it the “old” way will work smoothly.

Some new PHP “flaws” I stumbled upon (and how they affect your cake app)

Yeah, there really are some ugly truths to PHP sometimes. Most can be avoided or resolved easily, though (if known!). The following one for example has been in Cake for years (until 2.3 stable) in the FormHelper – until someone finally ran into the issue. And there will still be lots of other places where PHP itself creates a mess if not handled accordingly.

Don’t use in_array with mixed integers and strings

$result = in_array(50, array('50x')); // returns TRUE (usually unexpected)
$result = in_array('50', array('50x')); // returns FALSE (expected)
 
// the other way is the same (shit):
$result = in_array('50x', array(50)); // returns TRUE (usually unexpected)
$result = in_array('50x', array('50')); // returns FALSE (expected)

As you can see, using the first argument without any casting can lead to some unexpected results if you are not aware of it. You can also use true as third param to make the comparison strict:

$result = in_array(50, array('50f5c0cf-5cd8'), true); // returns FALSE (expected)

There might be some cases where the above “strange” result is the expected one – but in most cases probably not. It sure is one of the most stupid PHP flaws I have ever seen. The string 50x is never numeric and therefore should never be cast to a numeric value here IMO.

You can also write yourself a neat little Utility method that actually works as expected here (at least the test cases pass now): Utility::inArray().
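As a rough idea of what such a wrapper could look like, here is a minimal sketch. It is not the actual Utility::inArray() code; inArrayStrict() is a hypothetical function that assumes scalar values only and compares everything as strings. The results of the loose comparisons shown above depend on your PHP version (PHP 8 changed how numbers and non-numeric strings are compared), so the sketch only relies on strict comparison, which behaves the same everywhere.

```php
<?php
/**
 * Hypothetical helper: compare needle and haystack values as strings,
 * using strict comparison. Only meant for scalar values.
 */
function inArrayStrict($needle, array $haystack) {
    return in_array((string)$needle, array_map('strval', $haystack), true);
}

var_dump(inArrayStrict(50, array('50x')));   // bool(false)
var_dump(inArrayStrict('50x', array(50)));   // bool(false)
var_dump(inArrayStrict(50, array('50')));    // bool(true)
var_dump(inArrayStrict('50', array(50, 51))); // bool(true)
```

The string cast keeps the convenience of matching 50 against '50' (think ids coming from the database or the url), without ever treating '50x' as a number.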

Use STRICT comparison where possible, but ALWAYS for strings

list($var1, $var2) = array("1e1", "10"); // clearly two different strings
var_dump($var1); 
var_dump($var2);
var_dump($var1 == $var2);
var_dump($var1 === $var2);
// Result:
string(3) "1e1"
string(2) "10"
bool(true) // !!!
bool(false)

Probably the most evident example is:

$x = 0;
$y = 'foo';
if ($x == $y) {
    echo 'same';
} else {
    echo 'NOT same';
}

What do you think this echoes? Yeah, “same” due to the implicit int cast happening here!

Just to be clear: This is not just a bug (although I see it as one – as it’s the root cause of the above in_array issue!) – this is “expected behavior” in all PHP versions (including PHP5.4 and above). Unfortunately even more advanced (cake) programmers do not see the need to use strict comparison for strings, even though there are no downsides at all. On the contrary, it is slightly faster and always more correct. I started to revise my own code regarding this just a few weeks back and also went ahead and fixed quite a few of those flaws in the core lately.

For other types like integers or arrays it is also wise to switch to strict comparison wherever possible. It usually improves code quality in a language that is sometimes just too loose regarding type-safety. And it triggers errors/warnings earlier than it would without it if you do something wrong. So it also might cut down development time in the long run.

Some more examples:

$count = count($users); // returns int
if ($count === 1) {} // always an integer

The $count must be of the type integer (it does not come from user input or the database) here and therefore can be used with strict comparison.

public function foo(array $someArray) {} // you cannot pass strings, booleans, integers, ... now

If you always expect an array you can use array typehinting. Just bear in mind it will also reduce the flexibility to pass objects which implement the ArrayAccess interface. If you only use it internally you are pretty safe to use the typehint here, though.

 

 

CakePHP and SEO

29 Dec

I do have to admit that in the past I never really paid too much attention to SEO (Search Engine Optimization) or how search engines treat my sites in general. This can be quite a pitfall and cost you quite a bit of visibility these days, though.

With this post I will not go too much into details, but pick out some major points that really can improve your page rank and visibility. I will also mainly focus on the technical aspects here (for content and other non-framework-related stuff please consult your SEO expert).

robots.txt

This is a file for search engines (SE) to quickly find out which paths they are supposed to ignore – and more.

User-agent: *
Disallow: /outbound/
Disallow: /js/
Disallow: /css/
Disallow: /files/
...

List the paths of your static files and the controllers that you do not want to be indexed and/or followed here.

sitemap.xml

Especially google, but also many other SE, likes to have such a pre-formatted way of telling it how your site is structured and how it should be recognized. It will also provide all the basic page urls right away. You can read more about it here. There are even some validators out there you can use to check your sitemap is valid.

There are CakePHP plugins that are able to generate sitemaps for you. This way your sitemap.xml file will be created dynamically on demand and will always be up to date.

Avoid duplicate content

Slash or no slash

As outlined in this article, it is very important to route only to either the trailing-slash or the no-slash version of your action urls. The first one is preferred by SE. This means that /controller/action should 301 redirect to the correctly routed /controller/action/. But the second one is used in Cake. And since this one works out of the box with the current 2.x Router class, it might make sense to stick to it. Personally, I also favor the no-slash option since it makes your life so much easier, especially with Cake.

Just place this snippet in your htaccess:

RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^(.+)/$
RewriteRule ^ %1 [R=301,L]

But for the sake of completeness I will also outline how to get it to work vice-versa. For the trailing slash option we need to do a little bit more. There will be several steps necessary to achieve this site-wide.

First of all, you need to make sure you create the desired urls in your application. Since the trailing version is usually preferred, let’s try it that way. Put something like this in your AppHelper to generate trailing slashed urls in your views:

const EXCLUDE_ENDING_PATTERN = '/\\.([a-z0-9]+)$/i';
const EXCLUDE_JS_PATTERN = '/\bjavascript:/';
 
/**
 * Exclude pattern for special endings and js code
 * otherwise SEO trailing slash is added
 */
public function url($url = null, $full = false, $trailingSlash = true) {
    $routerUrl = h(Router::url($url, $full));
    $lowerRouterUrl = strtolower($routerUrl);
    if ($trailingSlash && substr($routerUrl, -1) !== '/' &&
        !preg_match(self::EXCLUDE_ENDING_PATTERN, $lowerRouterUrl) &&
        preg_match(self::EXCLUDE_JS_PATTERN, $lowerRouterUrl) === 0 &&
        strpos($lowerRouterUrl, '#') === false
    ) {
        $routerUrl .= '/';
    }
    return $routerUrl;
}

It tries to avoid the trailing slashes for file extensions. This can be further optimized, but the basic idea should be clear: for all “normal” and “real” urls we add the trailing slash.
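If you want to play with the rule outside of a Cake app, the same decision can be sketched as a small standalone function. withTrailingSlash() is a made-up name, and the sketch only mirrors the three exclusions from the helper above (file extensions, javascript: links and urls containing a #); query strings are not handled here either.

```php
<?php
/**
 * Standalone sketch of the trailing slash rule (hypothetical function).
 * Leaves urls alone that already end in "/", look like a file,
 * are javascript: links or contain a fragment.
 */
function withTrailingSlash($url) {
    if (substr($url, -1) === '/') {
        return $url;
    }
    if (preg_match('/\.[a-z0-9]+$/i', $url)) {
        return $url; // file extension like .json or .pdf
    }
    if (stripos($url, 'javascript:') !== false || strpos($url, '#') !== false) {
        return $url;
    }
    return $url . '/';
}

$urls = array('/posts/view/1', '/posts/view/1/', '/files/report.pdf', '/posts/view/1#comments');
foreach ($urls as $url) {
    echo $url . ' => ' . withTrailingSlash($url) . "\n";
}
```

Only the first url gets a slash appended; the other three come back unchanged.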

For all redirects inside your application you need to adjust the redirect method in your AppController:

/**
 * Try to prevent htaccess 301s due to missing / at the end. We redirect to the correct url directly.
 */
public function redirect($url, $status = null, $exit = true) {
    if (is_array($url) && empty($url['ext'])) {
        $url = Router::url($url, true);
        if (substr($url, -1, 1) !== '/') {
            $url .= '/';
        }
    }
    return parent::redirect($url, $status, $exit);
}

For urls generated in your controller via Router::url() you need to manually adjust the url:

$url = Router::url(array('some' => 'param')) . '/';
$this->Model->processOrStoreUrlInSomeWay($url);

Last but not least you need to modify your virtual host setup or the htaccess and add the following snippet:

RewriteCond %{REQUEST_URI} !\.[a-zA-Z0-9]+$
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ %{REQUEST_URI}/ [R=301,L]

It will redirect (301 – permanent) non-trailing-slash urls to the one with the trailing slash. Doing the redirect this early in the htaccess takes the overhead away from php and makes this very fast.

Now all internal linking (both links and redirects) should be addressed and working as expected.

Side note: Browsers will usually cache 301 redirects. If you debug your redirects, keep that in mind.

Multiple urls to the same content

If there are pages that contain the same content or if an action can be visited via multiple different urls we have a problem called “duplicate content” and can be penalized by SEs.

Use the canonical tag to tell the SE which one is the “real” page that should be indexed. All pages that have a canonical tag linked to the parent page (usually containing the same content) will not be regarded as duplicate content. Issue resolved.

<link rel="canonical" href="/controller/action/passed/" />

This should be in the <head> tag of your HTML layout.

You can safely add it to every page. If it equals the current page it will just be ignored. We usually use this to start off:

if (!isset($canonical)) {
    $canonical = $this->request->here();
}
if ($canonical !== false) {
    echo '<link rel="canonical" href="' . $this->Html->url($canonical) . '">';
}

This can still produce duplicate content, though, as /users/index/ would route to the same page as /users/. You can avoid that by calling Router::url() instead, which would produce the second url. Also make sure that you handle passed and named params as well as querystrings here. We do that in the controller scope (usually in a component) and pass the cleaned canonical url to the view, thus skipping the default and potentially incorrect $request->here().

Side-note: This is also a good time to talk about passed and named params and/or querystrings. People sometimes ask when to use what. As a guideline, passed params are more stable due to the fixed order and usually help to build the url. They will then probably be part of the canonical link, as well. Named params and querystrings are usually more versatile and therefore more suited for adjustment and filtering. In most cases it makes sense to filter them out altogether and set the canonical to the url with only passed params. For pagination without sorting/filtering the page param could be handled separately, though (see below), and whitelisted here. There are many use cases where the querystrings might very well have a right to be in the canonical url. A whitelist of some sort might help here, as well.
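Such a whitelist could look roughly like the following sketch. It is plain PHP without any Cake classes. The function name buildCanonicalUrl(), the decision to drop page=1 and the trailing slash convention are my assumptions for this example, so adjust them to your own routing:

```php
<?php
// Hypothetical helper: $whitelist holds the query keys that may stay in the canonical url.
function buildCanonicalUrl($path, array $query = array(), array $whitelist = array()) {
    // keep only whitelisted query params, drop sorting/filtering ones
    $kept = array_intersect_key($query, array_flip($whitelist));
    // assumption: page 1 shows the same content as no page param at all
    if (isset($kept['page']) && (int)$kept['page'] <= 1) {
        unset($kept['page']);
    }
    // stable order, so ?a=1&b=2 and ?b=2&a=1 end up as the same canonical url
    ksort($kept);
    // assumption: the trailing slash version is the "real" one
    $url = rtrim($path, '/') . '/';
    if ($kept) {
        $url .= '?' . http_build_query($kept);
    }
    return $url;
}

echo buildCanonicalUrl('/posts/index/2012', array('sort' => 'title', 'page' => '3'), array('page')) . "\n";
```

The sort param is filtered out here while the whitelisted page param survives. In a Cake app you would feed it the passed-params url and $this->request->query and set the result as $canonical for the view.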

Title and meta tags

The title together with the “description” meta tag are very important. They both are used by SEs to find the page via search terms and to display a small excerpt.

<meta name="description" content="My Description of this page!" lang="en" />

For pagination it is recommended to add the current page after the title:

if (!empty($this->params['named']['page'])) {
    $title .= ' (' . __('Page %s', $this->params['named']['page']) . ')';
}

More meta tags can be quite important:

The content type tag should already be part of your layout:

<?php echo $this->Html->charset(); ?>

The content language meta tag is used to tell the SE which language this page is in:

<meta http-equiv="content-language" content="de" />

For more specific language definition you can add the regional part:

<meta http-equiv="content-language" content="de-DE" />

But as these meta tags seem to be on their way to becoming obsolete, the most future-proof approach is to additionally set the lang attribute on the html tag:

<html lang="de">

Keywords can help the SE to decide if the page is relevant:

<meta name="keywords" content="Comma,Separated,Keywords" />

Don’t use too many keywords on one page, though. SEs are known to penalize or blacklist your site for abuse.

For telling SEs how to treat the current page, you can use the robots meta tag:

<meta name="robots" content="index,follow,noarchive" />

This would tell the SE to index the current page, follow all links and not to cache (archive => available offline) this page.

<meta name="robots" content="noindex,nofollow" />

This would make the SE disregard this page and its links completely.

Although it is still widely used, some SEs might ignore it. So more and more people move to other, more reliable options such as robots.txt or htaccess directly.

Is there more?

Yes, there is. SEO is a topic where there will always be new developments. One new element, for example, is the hreflang-canonical-tag combination. We tried it and failed at it (so far), so we had to revert the changes made. The outcome was not what we expected. If you succeeded in using it with a multilingual site, please share your doings and results.

Also, I have been thinking about modifying the core Router class and submitting this as a PR (Pull Request) and enhancement for future versions of CakePHP to handle this issue a little bit more gracefully. See my ticket for details. This would allow us to do this in a clean way.

Appendix

This snippet helps to prevent double slashes from creating duplicate content:

RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]

It removes additional slashes leaving only a single one (at the end as well as between):

/controller/action// becomes /controller/action/
/controller//action becomes /controller/action
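If you want to check the expected redirect targets in a test case, the same normalization can be sketched in a few lines of PHP. The helper name collapseSlashes() is made up, and it is only meant for the path part of a url (it would also mangle the // after http:):

```php
<?php
// Hypothetical helper: collapse every run of slashes in a request path into one.
function collapseSlashes($path) {
    return preg_replace('#/{2,}#', '/', $path);
}

echo collapseSlashes('/controller/action//') . "\n";
echo collapseSlashes('/controller//action') . "\n";
```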
 

Posted in CakePHP

 

CakePHP and HTTP/1.1

23 Dec

Most people probably didn’t even know that Cake internally switched from HTTP/1.0 (cake1.x) to HTTP/1.1 (cake2.x). While most changes regarding the new specification are great, they can also break existing functionality, of course.

New features

With the new protocol there are a few new features that can be used now in Cake:

Caching improvements

Conditional view rendering is one of the new features possible with HTTP/1.1 and a few new headers available now.

New status codes

Interesting is the new code 410 (Gone) which can be “used when a resource has been removed permanently from a server, and to aid in the deletion of any links to the resource”.

Of course, it has a lot of other improvements over the old protocol, as well. You can read about them here.

file_get_contents() issues

In an external php file we need to get some data. This script is quite old and simply gets some serialized data from the main server:

$content = file_get_contents('http://mainserverurl.com/some/action/');

Now, with the upgrade to 2.x, those scripts suddenly took 15 seconds. We wondered what could be the cause of the issue and discovered that in HTTP/1.1 “all connections are considered persistent unless declared otherwise”. So the client keeps waiting on the open connection until it times out.

Client-side solution

According to this question and its answer here it is possible to still use file_get_contents:

// double quotes are required here, otherwise \r\n is sent as literal characters
$context = stream_context_create(array('http' => array('header' => "Connection: close\r\n")));
$content = file_get_contents('http://mainserverurl.com/some/action/', false, $context);

You manually tell the server to close the connection after the retrieval.
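As a safety net you could additionally cap the waiting time. The following is just a sketch: the helper name closeConnectionContext() and the timeout values are my own picks, while header and timeout are regular options of PHP's http stream context. The actual request is commented out since the url is only the placeholder from above:

```php
<?php
// Hypothetical helper: builds a stream context that asks for a non-persistent connection.
function closeConnectionContext($timeout = 5.0) {
    $options = array(
        'http' => array(
            'method' => 'GET',
            // double quotes, so that \r\n really is a CRLF
            'header' => "Connection: close\r\n",
            // upper limit in seconds for reading, in case the server ignores the header
            'timeout' => $timeout,
        ),
    );
    return stream_context_create($options);
}

$context = closeConnectionContext(3.0);
// $content = file_get_contents('http://mainserverurl.com/some/action/', false, $context);

// inspect what would be sent
$options = stream_context_get_options($context);
var_dump($options['http']);
```

With the timeout in place, a server that keeps the connection open costs you at most a few seconds instead of the full default socket timeout.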

Server-side solution

In our case, and since the script was also used by servers we do not directly control, we had to make our cake app send the appropriate header:

// inside our action:
$this->response->header('Connection: close');

You manually tell the client to close the connection after the retrieval.

So what do we learn from that?

Better avoid unintelligent php functions like this one and use curl or other approaches instead. But sometimes you are outside Cake (and therefore not able to use the HttpResponse class) and maybe not even able to use curl. Then it helps to know how to get it working again, after all.

 

Posted in CakePHP

 

User-Switch for CakePHP apps

15 Dec

What I call a user-switch is nothing more than a simple select box to choose the user you want to be logged in as.

Why

This can be a very useful tool and time saver for multiple reasons. You might want to quickly edit/see the frontend just as any of your users, for example. In most cases the password itself is a personal one and only known to the users themselves (which is fine!). Therefore an admin can safely jump into a user’s account without compromising the password here. But please read the disclaimer below (final notes).

Basic idea

First of all you need an admin view or a sidebar element containing a select box of your users and a post button submitting to your switch action. This select box should only be visible to you if you are currently logged in as admin. After submit, this action overwrites the session data for Auth. After the redirect you should be logged in as this new user just as if you had used the login form. But you did not have to know the password or use any master password solution, which can be a potential security issue. You are already verified as an authenticated role, and this renders it safer than those other approaches where just about anybody could try to trigger the switch.

Using this action you can also switch back to your original account again.

Tips

After using it for several years now in multiple different ways and apps I will try to outline a few sticking points.

You should store an Admin.id in your session Auth data to “know” if you are the real user or the fake (switched) one. This is handy if you do NOT want to trigger certain things like “online activity update” or “message read” etc., which IMO only the real user should trigger. This way you can prevent that.

Also make sure only the admin role that will do the initial switch has access to the switch action. You can also allow the Admin.id access, of course (if you want to be able to switch back again).

You can add -1 as default value in your select box (at the very top for example) to switch back to the own account and handle that separately in your switch action.

After a successful change don’t forget to redirect to a page all roles can access to prevent any redirect loops here.

Implementation

The quickest way would be to use my Tools Plugin and include the CommonComponent in your public $components array. Alternatively you could also use my DirectAuthenticate directly – or even Auth->login() manually. But for the sake of simplicity here is my DRY approach:

You need a switch action in your users controller:

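public function switch() {
    // Note: "switch" is a reserved word and only parses as a method name on PHP 7+.
    // On older versions use a name like switchUser() instead.
    if ($this->Common->isPosted()) {
        $id = $this->request->data['User']['id'];
        if ($this->Common->manualLogin($id)) {
            // success message and redirect
        }
        // error message and referer redirect back to the action we posted from
    }
}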

The manualLogin() method can automatically log in the user just as Auth->login($user) would. Only that the first one is a cleaner approach regarding settings such as scope and contain which will automatically be transferred over from the Auth->authenticate settings. So if you contain Role and Customer relations for example, the logged in user via manualLogin will always have the same session data just as he would by normally logging in. This is IMO more robust than using find() and login() where you might end up forgetting to adjust the find options here due to the code redundancy.

Even if you don’t want to use any of my code, the basic idea of how it works should still be clear. The main point here is that “swapping” the Auth data in your session makes it possible to jump.

More advanced implementation

Cake2.3 using onlyAllow():

$this->request->onlyAllow('post');
if (!$isAdminRole && !$this->Session->check('Auth.Admin.id')) {
    throw new MethodNotAllowedException(__('Access not allowed'));
}
 
$formerId = $this->Session->read('Auth.User.id');
$id = $this->request->data['User']['id'];
if ($this->Common->manualLogin($id)) {
    if (!$this->Session->check('Auth.Admin.id')) {
        $this->Session->write('Auth.Admin.id', $formerId);
    } elseif ($this->Session->read('Auth.Admin.id') == $id) {
        $this->Session->delete('Auth.Admin.id');
    }
    // success message and redirect
}
// error message and referer redirect back to the action we posted from

Using the Auth.Admin.id we can easily find out when we are back as the original user and remove this session data. This way everything is back to the way it was before. We also assert that only the admin (switched or not) has access. Don’t forget to manually populate $isAdminRole here with your (ACL) way of retrieving this information.
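To make the bookkeeping easier to follow, here is a framework-free sketch with a plain array standing in for the Auth part of the session. The function name switchUser() is hypothetical, and the comparison against the id we switch to is my reading of “back as the original user”:

```php
<?php
// Hypothetical stand-in for the session: keys mirror Auth.User.id and Auth.Admin.id.
function switchUser(array $auth, $newUserId) {
    $formerId = $auth['User']['id'];
    // this is what the manual login would do to the session
    $auth['User']['id'] = $newUserId;
    if (!isset($auth['Admin']['id'])) {
        // first switch: remember who the real admin is
        $auth['Admin']['id'] = $formerId;
    } elseif ($auth['Admin']['id'] == $newUserId) {
        // switched back to the own account: remove the marker again
        unset($auth['Admin']);
    }
    return $auth;
}

$auth = array('User' => array('id' => 1)); // admin with id 1 is logged in
$auth = switchUser($auth, 5);              // now user 5, Admin.id is 1
$auth = switchUser($auth, 7);              // now user 7, Admin.id is still 1
$auth = switchUser($auth, 1);              // back as admin, Admin marker removed
var_dump(isset($auth['Admin']));
```

Switching from one user directly to another keeps the original Admin.id untouched, so the way back always works.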

Final notes

This feature should not be used for social network sites or other similar apps where there will be “private” user content. You can use this for websites where you have employees, for example, who are all aware of the fact that the admin can temporarily take over the user account, and where the users do not store sensitive information or send private messages themselves.

This is also very helpful in testing and during development. Just make sure you only display and allow switching with debug mode on or with env('REMOTE_ADDR') === '127.0.0.1' etc.

Have fun :)

 

Posted in CakePHP

 

Localized number formats for forms

09 Dec

Often your language is not English and therefore the locale settings for numbers are not . for decimals and , for thousands, but the opposite, for example. If you do not allow localized input/output in your CakePHP apps it often confuses users. So it is wise to do so.

You want to convert the localized values to the internal number format on save() and vice versa on find() prior to populating the form on edit.

I use my NumberFormat behavior to accomplish all those things in a clean way.

Important: This is only for form fields. Do not use it to generally format your find() results. That’s what the View helpers like Time and Number are for. You need to format and output those values in a localized form in the view as needed using their methods.

Setup

I assume you already got the Tools Plugin up and running. You can either attach the behavior statically:

public $actsAs = array('Tools.NumberFormat' => array('output' => true, ...));

or you can dynamically load it at runtime for the specific add/edit actions with custom options (usually the better way for customization):

$this->ModelName->Behaviors->load('Tools.NumberFormat', array('output' => true, ...));

Next you should set a global default localization pattern. You can use Configure:

$config['Localization'] = array(
    'thousands' => '.',
    'decimals' => ',',
);

You can also use the system setting

setlocale(LC_NUMERIC, 'de_DE.utf8', 'german');

in combination with localeconv set to true. But this can often have side effects elsewhere.

Basic usage

For add actions we don’t need any output, so we simply use

$this->ModelName->Behaviors->load('Tools.NumberFormat');

For edit actions we also need the localized output:

$this->ModelName->Behaviors->load('Tools.NumberFormat', array('output' => true, 'fields' => array('custom_float_field')));

We also want the behavior to convert the field custom_float_field here.

Advanced usage

Multiply

If you want your users to input percentages as integer values (0 … 100 instead of 0.0 … 1.0), you can use multiply:

$this->ModelName->Behaviors->load('Tools.NumberFormat', array('multiply' => 0.01));

This would convert 50 to 0.5 on save and undo it back to 50 on read (with output set to true, of course!).
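What happens to the values can be sketched without the behavior. The two helper functions below are made up for illustration (they are not the behavior's internals) and default to the German pattern from the Configure example above:

```php
<?php
// Hypothetical helpers, only to illustrate the save/read round trip.
function localizedToFloat($value, $thousands = '.', $decimals = ',') {
    $value = str_replace($thousands, '', trim($value)); // drop the grouping separator
    $value = str_replace($decimals, '.', $value);       // internal decimal point
    return (float)$value;
}

function floatToLocalized($number, $places = 2, $thousands = '.', $decimals = ',') {
    return number_format($number, $places, $decimals, $thousands);
}

$multiply = 0.01;
$stored = localizedToFloat('50') * $multiply;      // on save: percentage to fraction
$shown = floatToLocalized($stored / $multiply, 0); // on read: back to the form value

echo floatToLocalized(localizedToFloat('1.234,56')) . "\n";
echo $shown . "\n";
```

Note that this lenient conversion happily accepts garbage input, which is exactly what the strict option is for.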

Strict

My experience is that too loose validation for decimals can result in a mess. Often, the resulting price in our database was 100 times the entered value just because the person used . or an invalid combination of multiple ., , or both. You can easily prevent this by expecting a valid localized value using strict:

$this->ModelName->Behaviors->load('Tools.NumberFormat', array('strict' => true));

This would not allow . (dot) as decimal if your localized decimal is currently , (comma) – among other things.
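A strict check along those lines could look like this sketch. The function name isStrictLocalized() and the regular expression are my own assumptions about what “valid localized value” means (optional grouping in blocks of three digits, one decimal separator); the behavior itself may be stricter or looser:

```php
<?php
// Hypothetical check: does $value match the localized number pattern exactly?
function isStrictLocalized($value, $thousands = '.', $decimals = ',') {
    $t = preg_quote($thousands, '/');
    $d = preg_quote($decimals, '/');
    $grouped = '-?\d{1,3}(?:' . $t . '\d{3})+'; // 1.234 or 12.345.678
    $plain = '-?\d+';                           // 1234
    $pattern = '/^(?:' . $grouped . '|' . $plain . ')(?:' . $d . '\d+)?$/';
    return preg_match($pattern, $value) === 1;
}

$inputs = array('12,50', '1.234,56', '12.50', '1.2,3.4');
foreach ($inputs as $input) {
    echo $input . ' => ' . (isStrictLocalized($input) ? 'valid' : 'invalid') . "\n";
}
```

With the German pattern the first two inputs pass, while 12.50 is rejected instead of silently ending up as 1250 in the database.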

Before

Sometimes you do not need/want to validate your data. In that case use before to delay the conversion:

$this->ModelName->Behaviors->load('Tools.NumberFormat', array('before' => 'save'));

This will then execute on save() even if you set $validate to false.

Tips

If you already loaded your behavior statically, and you want to change the settings, use unload() prior to load():

public function edit() {
    $this->ModelName->Behaviors->unload('NumberFormat');
    $this->ModelName->Behaviors->load('Tools.NumberFormat', $myNewConfig);
}

Note: to unload a behavior, do not use the Plugin. prefixed syntax here.

For more examples and use cases see the test case.

Question

How do you manage to provide localized forms? Clearly CakeNumber itself is not enough here.

 

Posted in CakePHP

 

CakePHP Tips – Winter 2012

01 Dec

Some more tips I gathered the last couple of weeks and want to share with the (Cake/PHP)world :)

jQuery and CakePHP FormHelper::radio()

Usually you would use

var value = $('input[name=fieldName]:checked').val();

But since CakePHP uses name="data[ModelName][field_name]" when generating the form fields you cannot just use input[name=data[ModelName][field_name]]. Brackets nested inside each other are not allowed there. So you need to do it this way:

var value = $('input[name=\'data[Default][status]\']:checked').val();

The escaped field name will make it work again.

How to dynamically overwrite the error message

The most important one for me: I finally figured out how to override the error message from within a custom validation method. I played around with it and, with some help from stackoverflow/irc, solved a problem I had not been able to solve so far. So I want to start with that.

Imagine a custom rule that calls a webservice or another class which then could throw an exception. We want to catch the message of it and display it instead of the default message “Something went wrong” :) It is actually pretty easy, just do not return false but the error string itself:

public $validate = array(
    'url' => array(
        'validateUrl' => array(
            'rule' => array('validateUrl'),
            'message' => 'valErrDefaultError',
        ),
    ),
);

public function validateUrl($data) {
    $content = array_shift($data);
    try {
        ...
    } catch (Exception $e) {
        //crazy stuff which is working but NOT RECOMMENDED in this case:
        //$this->validator()->getField('url')->getRule('validateUrl')->message
        //return false;

        //just return the error message you want to replace the default one with
        return $e->getMessage();
    }
    //no exception thrown - seems to be fine then
    return true;
}

PS: The commented out code SHOULD NOT be used, but would also work. Those are from my trial and error runs figuring it out. Using the new validator object you can access the properties prior to their usage.

But the simple and correct way here is just to return the error string. So in our case the form field will have the error message from the exception now.

PS: Returning boolean false will cause the original error message defined in your validation array to be displayed, of course.
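The three possible return values can be illustrated with a tiny stand-alone sketch. This is not Cake's validation code. runRule() and checkUrl() are hypothetical names and the url check is just a dummy that throws:

```php
<?php
// Hypothetical mini validator: maps a rule's return value to an error message (or null).
function runRule($rule, $value, $defaultMessage) {
    $result = call_user_func($rule, $value);
    if ($result === true) {
        return null;            // valid, no error
    }
    if (is_string($result)) {
        return $result;         // the rule supplied its own message
    }
    return $defaultMessage;     // false: fall back to the configured message
}

// Dummy rule standing in for the webservice call
function checkUrl($url) {
    try {
        if (strpos($url, 'http') !== 0) {
            throw new Exception('Only http(s) urls are supported');
        }
    } catch (Exception $e) {
        return $e->getMessage();
    }
    return true;
}

var_dump(runRule('checkUrl', 'ftp://example.com', 'valErrDefaultError'));
var_dump(runRule('checkUrl', 'http://example.com', 'valErrDefaultError'));
```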

The book already has quite a thorough documentation on how to Dynamically-change-validation-rules. Using the validator object basically saves you the trouble working with the $validate array directly.

Avoid “pass by reference” outside of method scopes

<?php
$spell = array("double", "toil", "trouble", "cauldron", "bubble");
foreach ($spell as &$word) {
    $word = ucfirst($word);
}
foreach ($spell as $word) {
    echo $word . "\n";
}

Results in

Double
Toil
Trouble
Cauldron
Cauldron

An issue well known to many (here or here): after the first loop $word still references the last array element, so the second loop overwrites that element on every iteration. So avoid it in view templates where you might re-use the variable somewhere later in the code, or always remember to unset it after the foreach loop etc. The advantage of using small methods in helpers, components, etc is that the scope of the variable is only inside this method and cannot affect anything else anywhere.
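For completeness, a small sketch of the two usual ways out: unsetting the reference right after the loop, or not using a reference at all. The variable names are taken from the example above, $mapped and $lines are new:

```php
<?php
$spell = array("double", "toil", "trouble", "cauldron", "bubble");
foreach ($spell as &$word) {
    $word = ucfirst($word);
}
unset($word); // break the reference to the last element

$lines = array();
foreach ($spell as $word) {
    $lines[] = $word; // $word is a plain copy again
}
echo implode("\n", $lines) . "\n";

// Alternative without any reference
$mapped = array_map('ucfirst', array("double", "toil", "trouble", "cauldron", "bubble"));
```

Both $lines and $mapped now end with Bubble instead of a second Cauldron.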

And in general, trying to avoid the pass by reference is often the best approach. At least where memory is not an issue.

 

Posted in CakePHP