JFilterInput insecure?
Friday, 08 July 2011 13:28

Issue: The filtering of input data is not bulletproof
Impact: Security problems, XSS, etc.
Version: Joomla 1.5, Joomla 1.6
JFilterInput is the main class to filter the input data from the user side in both 1.5 and 1.6. It has been put under some scrutiny in Joomla 1.6 by a very talented security expert and he found numerous issues in the extremely complex class. Most of those were fixed, but is the class now bulletproof? I'm not sure. Regard this posting more as a rumour and not as hard facts.

For some time in May 2010 I tried to check this class for errors, specifically because there were unittests in place that didn't pass and basically allowed some XSS. So I fooled around in the code and at some point finally got it to pass all unittests with flying colors. Shortly before 1.6 was released, I played around with input data for entirely different reasons and stumbled over a string that passed JFilterInput even though it shouldn't. I reported it and it later became evident, that this vulnerability was introduced by my fix from May 2010.

Since then, several people from the PLT have been fixing bugs in this class, which have been reported by the mentioned security expert, but a quick little test showed me, that not all holes have been fixed. Yes, you have to test around a lot to find a combination of single quotes, double quotes and other characters to actually exploit anything, but the filtering result overall is not satisfactory from my point of view. However, there are two things that concern me more than this perceived issue. The first one is, that there has never been a properly planned and executed security audit of Joomla, even though it was said that this would be happening before the release of 1.6. The second issue to me is, that the JFilterInput class from 1.5 and 1.6 are pretty similar. So similar actually, that the only changes that I can really see are security bug fixes and added comments for 1.6. And that in return would mean, that most likely the JFilterInput in 1.5 is holey like a swiss cheese...

So, is 1.6 insecure? Maybe. Is 1.5 insecure? Very probable.

Don't get me wrong, I'm not blaming anybody that the self-written input filtering class from Joomla is not working completely, because filtering is an impossibly complex issue, especially proper HTML filtering. Almost no PHP project out there is getting this truly right and even those existing solely to solve this feat rarely succeed. I personally know of only one filtering solution so far that seems to solve this satisfactorily and that is HTML purifier. Their comparison chart is very interesting.

What is the conclusion from all of this? First of all, we have to more thoroughly check JFilterInput for Joomla 1.5, 1.6 and 1.7 and we have to make sure that this class is fixed, too, in 1.5. Second, we should consider dropping the functionality in this class in favour of a solution that is proven to be working and where we don't have to invest months of development work to get something that is already readymade for us out there. Now I'm not a big fan of big, massive third party libraries (for example I'd like to remove TinyMCE in favour of a small mootools editor like mooeditable) but in this case, it looks like the choice is simple and the big code size a small price to pay.

Add a comment
Broken core component routing
Saturday, 02 July 2011 10:14

Issue: The routing of the core components is broken by design
Impact: wrong URLs, duplicate content, SEO
Version: Joomla 1.6
I already hinted here that the routing in the Joomla core components is broken. I'm going to elaborate on this here now.

First some theory:
Joomla is heavily based on the ItemID, which basically is the concept of component configurations per menu entry. This means, that each and every URL in Joomla needs an ItemID. At the same time, the best/correct ItemID for a page in Joomla can change, since it depends on the menu system. As a rule of thumb for the correct menu item/ItemID to choose: Take the ItemID which requires the least number of clicks from clicking on the menu item untill you are on the page that your URL should point to.
A little example: If you have two menu items, one pointing to the categories view of com_content, the other pointing to a category view and now want to find out the ItemID for an article in the category, the correct ItemID would be the menu item linking to the category view, since you only have to click once to get to the article instead of twice, first on the category and then the article.

So in order to have correct ItemIDs, we need to calculate that ItemID for each URL each time we render it and specifically can not hard-code it into the link. To solve this issue, the class ContentHelperQuery was introduced in 1.5 and this concept was extended onto all other core components in 1.6. With this class, before we send the URL to JRoute to create a "nice" URL out of it, we calculate the ItemID by handing it over to a view specific function. For an article view in com_content, the call would look like this:

<?php echo JRoute::_(ContentHelperQuery::getArticleView($article->id, $article->catid)); ?>

Now, did we solve our problem correctly this way? No. Because not all our URLs are created by PHP code, some of them are part of user content and besides that, its not really convenient for a developer to be required to think about these things. (Its also not convenient to have to think about how to properly write a URL, but that is another issue.)
If you have an internal URL in user content, you don't want to add the ItemID hard coded to that URL, since it might change and then your URL is broken entirely. However, JRoute also does not know which function to execute to find the right ItemID. For that, JRoute would have to have some information about the internal structure of the component and that is not the case in the current Joomla routing system. The only part of that routing system, that has component specific knowledge, is the component router itself. Knowing this, the only right place to find the ItemID is the component router itself. So we move the ContentHelperQuery class into the router and do the call in the router itself, instead of inside the call of JRoute::_() and everything is dandy, right?

Again: No. The next issue is the way Joomla handles SEF and non-SEF routing. When SEF URLs are enabled, Joomla includes the respective component router and lets it handle the URL. But if SEF URLs are disabled, Joomla basically just makes sure that the URL is XHTML compatible. (replacing & with its HTML entity &amp;) So we wouldn't even execute the logic to find the ItemID. To actually fix this problem without putting yet another band-aid on top, we have to re-think the way how the Joomla routing should work and maybe even remove the idea of a distinctive SEF URL mode. I have already done that and the solution is my molajo router plugin, which solves all these problems and even makes writing a component router a fun task. The Joomla project is of course welcome to use this plugin and adapt it into the Joomla core.

BTW: A good example of how this is broken, is the article linker plugin below the editor in your Joomla 1.6.


Add a comment
Hardcoded hitcounter in com_content
Friday, 01 July 2011 11:40

Issue: You can't switch off the hitcounter in com_content
Impact: Performance loss
Version: Joomla 1.5 - 1.7
com_content is somewhat trying to count the readers of an article. Its a noble attempt, but the way it is implemented is pretty much useless. The hitcounter basically only triggers, when you are looking at a single article, but not in a blog view. So the numbers that it produces are pretty much useless. However, this hitcounter has a serious downside: It stores the number of views in the content table itself, which means that the table is changed upon each view of a single article. This change however flushes the query cache of MySQL or in other words: Because you can't switch off the hitcounter in Joomla, MySQL has to do all those calculations for each query again, even if it is the same query from 5 seconds ago.

The solution would simply be a global parameter to switch this on or off.

Add a comment
Hardcoded rating implementation into com_content
Friday, 01 July 2011 11:36

Issue: article rating is hardcoded into com_content
Impact: Performance loss
Version: Joomla 1.6
com_content has a feature to rate articles with a 5 star system. This system is not really good and thus in Joomla 1.5 the visible output of the rating system has been moved into a plugin. However, the data retrieval and especially the actual voting process are still in com_content. While 1.5 disabled the data retrieval if the feature wasn't used, 1.6 now actually is in somewhat of an "always on"-state. So even if you are not using the Joomla rating system, you are joining the content table agains the ratings table.

The reason for this is, that you can't know if an article should be rateable before it is retrieved, because you don't have its parameters. The big downside however is an unnecessary join on a table that will be empty in 99% of all Joomla installations. As described in the com_content/com_contact coupling-issue, joining a large dataset on an empty table is extremely bad and here, too, the better solution would have been to create a seperate plugin and component to handle all of this. Or drop it completely like it was done with com_poll in 1.6.

BTW: Did you notice that you can't see the rating of an article in the backend?

Add a comment
com_content/com_contact coupling
Friday, 01 July 2011 11:27

Issue: com_content and com_contact have been coupled for a pretty useless feature
Impact: Big performance loss, wrong routing, loss of flexibility
Version: Joomla 1.6
The trigger for this blogpost is related to this post on the Joomla CMS Developer mailinglist:
This guy has around 70.000 articles in his system and now his system is crawling to a complete stop because of database issues. If you look into this, there are several easy to spot bugs in the way all of this is implemented.

Lets start simple: We look into /components/com_content/views/category/tmpl/blog_item.php on line 110

<?php if (!empty($this->item->contactid ) &&  $params->get('link_author') == true):?>
<?php     echo JText::sprintf('COM_CONTENT_WRITTEN_BY' ,
JHtml::_('link',JRoute::_('index.php?option=com_contact&view=contact&id='.$this->item->contactid),$author)); ?>

<?php else :?>
<?php echo JText::sprintf('COM_CONTENT_WRITTEN_BY', $author); ?>
<?php endif; ?>

The first bug here is JRoute::_('index.php?option=com_contact&view=contact&id='.$this->item->contactid). This URL does not have an ItemID and thus the link will always be something along the lines of "/component/contact/contact/<id>", even though there might be a menu item pointing to this contact. Not very nice. Although in my point of view the routing system is broken anyway, the correct code here would be JRoute::_(ContactHelperRoute::getContactRoute($this->item->contactid, $this->item->contactcatid)). (Notice that we are currently not retrieving the contacts category ID, which is the next bug...)

Then there is the use of JHTML::_('link'). Yes, in theory it is good style to generate every output by a specialized class, etcetera, yadda, yadda, yadda. Guys, we are talking about a freaking link here. We are always talking about designers not being coders and making it simpler for them. Using an unknown function to generate something that is the absolute minimum knowledge of everybody working in this business is not making it simpler. Its also more code and considering that many people are pretty good at reading XML, its also more difficult to read. Last but not least: This is the only time it is used in that layout. Heck, the whole Joomla codebase uses this code only 9 times, 4 times alone for this com_contact link in com_content.

Last stupid thing in this layout, I promise: In the first if-statement, we are first checking if there is a contactid and then we check if we even want to display it. This screams for notices to be thrown when the $item object does not have a contactid property. The proper way would be to switch the two statements. Why that is important will be obvious shortly.

After looking at the output, I'm going to go over to the source of the problem of the young man in the posting above: The model. This can be found in /components/com_content/models/articles.php

Lets look at lines 206 till 208:

// Join on contact table
$query->select(' as contactid' ) ;
$query->join('LEFT','#__contact_details AS contact on contact.user_id = a.created_by');

This code adds the table of com_contact to the query for the articles that should be displayed in this category. It does so without any conditional, specifically, it doesn't even care if we are using the feature "link author". You might ask "Why is that bad? Better prepared than sorry." The reason is pretty easy. If you are not using the "link author" feature, maybe even not using com_contact at all, the table of com_contact is empty. Adding a join on a table in a query that will create a lot of misses when looking up the data however is extremely bad. Databases don't really have a problem handling tables with a few million rows and even if you have a slow system, your database will have no real problem handling a few hundred thousand Joomla articles. But joining on an empty table completely annihilates that fact. Joining on empty tables is just bad. I can pretty much guarantee that the problems of that guy will be gone directly after removing or commenting these two lines of code. BTW: For the same reason, you should comment 215 and 216 that add the rating table to all of this.
There is a way to solve this issue and that is by checking in the model if the author should be linked and only then add the table from com_contact. This has a downside however: You wouldn't be able to enable this feature for single articles in a blog view of a category, since you wouldn't have the article parameters at that point yet. This again brings me to my strongest and most general point in all of this:

COM_CONTACT SHOULD NEVER EVER AND UNDER NO CIRCUMSTANCES BE CONNECTED LIKE THIS TO COM_CONTENT!!! I can't stress this enough. The most important principle of object oriented programming right after "everything is an object" is "decouple everything as much as possible". That means, that a component for a webshop is not dependend on a component for a forum and a contact plugin is not depending on a payment plugin. Because when that is the case like in our current situation, you are pretty much screwed. Not only does this create problems like our current one, which can't be solved without a new Joomla release or a core hack, but it also creates another problem: If you uninstall com_contact, com_content will flat out not work anymore. This is because the table that we are joining against will not be present anymore.

But how would we properly implement this? The proper way would be a plugin that is executed shortly before the layout is displayed. That plugin would then run over all the articles that will be displayed in just a moment, collect the IDs that need to be loaded from the database and then execute another query to fetch those. I already hear some of you say "additional queries are bad. BAD, I SAY!" but seriously, that would be an absolutely lightweight query and several magnitudes faster than the current solution. But most importantly: You could switch it off! and it also solves the above mentioned issue of enabling/disabling this for seperate users. Last but not least, you simply modify the value of the author property in the article object and you wouldn't even have to touch the layout anymore. To be honest, the trigger for the plugin to get to the data efficiently is missing in Joomla, but even without it, you could still solve this issue way more performant than it is done now.

This feature takes 5 lines of code. And it has 5 minor and major Bugs in those lines...

Add a comment