<

AntiPHPatterns and AntiPHPractices

Published on
46,898 Points
4,998 Views
14 Endorsements
Last Modified:
Awarded
Community Pick

AntiPHPatterns

Design Patterns are independent solutions to problems that occur over and over again.  Usually these are considered at a high level without much detail.  They are independent of core-implementation things like the exact language or data base.  And at the opposite end of things we find AntiPatterns which appear to be solutions, but are actually counterproductive, ineffective or subtly dangerous.  In 2010 Stefan Priebsch coined the phrase AntiPHPatterns, to describe some of these high-level pseudo-solutions that appear in PHP.  His excellent description was presented in London at the UK PHP conference.  His collection fell into four general categories: Constantitis, Globalomania, Singletonitis, and Godclass.


In this article I'll touch on Stefan's excellent analysis, and then I will add a collection of my own AntiPHP examples, showing how novice programmers often over-think or under-think specific problems.  I have seen these things in my work over the years, and on the occasions when I have had the opportunity to refactor, the view in hindsight has helped me to understand why some newer ways of looking at problems can help create a library of effective teaching examples.


Constantitis

In PHP, the define() function creates a constant.  Constants are immutable and global, and they can be defined anywhere in the code base.  As a result, it is entirely possible that two programmers working on the same project may define a constant with the same name, thus causing a collision that results in a run-time failure.  The presence of many global constants is a code smell.  Among other things, it may imply a large measure of configuration over convention, and with many possible configurations, things can become confusing.  In addition, if you want to change the settings that depend on the constants, you have to change the constants in the script.  Yecch!


Globalomania

In PHP, the global keyword creates a global variable, one that appears suddenly and unexpectedly in every namespace and scope.  This is slightly worse than a global constant, since it can be changed, and a change in one area of the script may have unwanted effects in other areas of the script.


Singletonitis

The singleton design pattern tries to solve the problem of having multiple instances of the same class, when only one instance is needed (eg: a data base connection).  But in practice, singletons are often used as if they were global variable containers. Singletons may share same the code smell with other globals (constants and variables).  Singletonitis has the same cure as constantitis and globalomania, which is dependency injection.


Godclass

It is axiomatic in software development that encapsulation is good, both for code and data.  A well-written class should do one thing perfectly (or as close to perfectly as possible).  If you try to describe the purpose of a class and you use any conjunctions, your class is probably defective because it is doing too much work in too many different ways.  When you find that an entire web page can be rendered by calling upon a single class, you've found Godclass.  The couplings and dependencies of the classes that extend Godclass are not readily visible, making it hard to know what changes will have adverse effects.  The cure for Godclass is isolation of data and functionality, so that objects know "all about" themselves, but treat other objects as if they are "black box" vending machines.  In other words, "minimize class responsibilities."


So much for antiPHPatterns.  Now let's look at AntiPHPractices.


AntiPHPractices

AntiPatterns are the high-level concepts gone awry.  More plentiful and finer grained blunders and code smells are called AntiPractices.  If it seems like this article has a negative, "don't do that" tone, it's because the article is intended to steer you away from risky or dangerous practices!  If you find these things in your own programming, get rid of them as soon as you get a chance.


PHP is an old programming language.  There are a lot of things that were designed into the language in the 1990's.  That was before hacking, spam, viruses, and other security problems were even beginning to be understood.  As a result, many of the code smells relate to things that seemed convenient at the time, but turned out to be dangerous or wasteful in retrospect.  Unfortunately, bad code examples do not come with expiration dates, and some of them (well, millions of them) are still published on the internet.  They are easy to find and they get copied over and over by novice programmers, perpetuating the bad practices.  If you find a code sample on the internet ask yourself, "When was this written?"  Also, ask yourself, "Do I completely understand what the author as thinking and how it relates to my problem?"  If you're not 100% sure about both of those answers, skip over the example and go look for one that makes more sense to you!  


These coding horrors are in no particular order, and they may not always cause scripts to fail, data bases to be destroyed, money to be lost, etc.  But they have that potential.


1. Installing the Code Without Understanding the Code

For some reason, people feel that they don't need to learn PHP before they start using it!  Unfortunately, the internet is littered with simply terrible examples of PHP code written long ago by novice programmers who did not understand the principles of computer science or security, and who never took the time to clean up after themselves.  And a particularly frequent offender, DreamWeaver, generates some of the worst PHP code ever written.  Copying code from internet resources or DreamWeaver is a sure way to get yourself in trouble.  So don't do that.  Instead stick to quality learning resources from dependable authors.  Like everything else in life, knowing what you're doing with PHP is really helpful.


Want a good test for understanding the code?  Try to say or write a one-sentence explanation for every line of code in your script.  If you can do that, you understand the code.  If you find yourself unsure or at a loss for words, stop what you're doing and look up the functions and classes on php.net.  A moment invested in learning can save you from days of debugging.  Further guidance on this point comes to us from Alexander Pope in his elegant Essay on Criticism.


2. Not Having a Backup Copy

Really?!  You made a change to a deployed production system?!


3. Not Having a Test Data Set

Really?!  You are about to make a change to a deployed production system?!  You might want to rethink that!


4. Programming Without Error_Reporting(E_ALL)

If you do not tell PHP to show you all the errors, you will miss some of the errors!  PHP will suppress the Notice messages unless you ask PHP to tell you the Notice messages.  Always ask any programming language to tell you all of the errors!


5. Suppressing Messages with @

Whenever there is a reason to suppress a warning message, there is an even greater reason to find out the cause of the message and eliminate it!  If I see the @ prepended to a function call, I expect to see the comments explaining why that is a good practice.  In almost every case it's not a good practice, but was an expedient way for a sloppy programmer to avoid a step in the debugging process.  It's significant to note that if you use @ to suppress messages and a Fatal Error occurs in the function, the script will fail silently.  There will be no message or log, and no indication of where the error occurred.  Imagine trying to debug something like that!


6. Register Globals

This takes the danger of Globalmania and sprays it randomly and universally into your script.  Disable Register Globals!  If you're at a current level of PHP, it should be disabled already.  Maybe you want to use phpinfo() to verify this.


7. Sloppy Syntax and Formatting

This is a programmatic shibboleth that screams, "I don't know what I'm doing!"  PHP allows you to write ugly code, very ugly code, and PHP will still interpret it, and if the code does not fail for a parse error, PHP will try to run it.  This is unfortunate, because it leads novice PHP programmers to think that coding standards do not matter, that they can use silly or meaningless variable names, that alignment and readability of the code isn't important.  Nothing could be further from the truth.  Neatness counts, and the reason for neatness and tidy organization is so that humans can make sense of your script.  Those humans include you, if you revisit the programming after a day or two away from the task.  Adopt a coding standard and adhere to it rigidly.  You probably want to adhere to PSR-1 and PSR-2.  With self-discipline comes self-confidence.


7a. Misusing Quotes and Apostrophes

Quote marks and apostrophes have special meanings in most programming languages and PHP is no exception, providing the novice with an almost endless number of ways to create parse errors with fiddly punctuation.  What do you need to know, when and how to use or avoid them?  See this article about Quotes in PHP Programing


8. Poorly Chosen Variable Names

What is the likely meaning of a variable named $x?  It's hard to guess!  If it's expected to be today's date, why not use $today instead?  And if you're retrieving rows from several different queries, beware of using $row.  It's just too easy to get confused.  Instead, strive for meaningful variable names that clearly identify your thought processes.  And in related matters, choose data base column names that you can readily understand and that do not conflict with SQL functions or reserved words.


9. Compound Statements

We've all seen them... Someone wants to look smart, so they nest a collection of function calls into a single-line statement.  The problem with doing this is pretty simple.  If you get an unexpected return value, you have to take the code apart to find out what screwed up.  Why not just start with the code on separate lines and save yourself a debugging cycle?  PHP does not run any faster when there are compound statements in the script!  PHP has to compile the script before it can be executed, and the compiled output is exactly the same, whether the statements were written legibly or all jammed up together into an unreadable mess.  PHP does not care, so take the easy approach -- the one humans are most likely to understand.  These two blocks of code do the same work.  The signature of compound statements are adjacent closing parentheses.

 

// NORMALIZE, CAPITALIZE, ESCAPE THE COLOR
$rgb = $mysqli->real_escape_string(ucfirst(strtolower(trim($color))));

// A CLEANER WAY TO NORMALIZE, CAPITALIZE, ESCAPE THE COLOR
$rgb = trim($color);        // REMOVE WHITESPACE
$rgb = strtolower($rgb);    // MAKE ALL LOWER CASE
$rgb = ucfirst($rgb);       // CAPITALIZE FIRST LETTER
$rgb = $mysqli->real_escape_string($rgb);


9.a Compound Arguments

These are closely related to compound statements, and they create a similar problem, to wit, you cannot readily see what data was passed in the argument string.  Consider this example:

 

$res= $mysqli->query($sql2  . $start . ', ' . $limit); 

What does the query string contain?  Who knows?  You would have to print out three variables to begin to figure it out.  A better ways to write the code would look something like this:

 

$sql = $sql2  . $start . ', ' . $limit;
print_r($sql);
$res = $mysqli->query($sql); 


10. Using REGEX to Do Things That PHP Should Do for You

I came across this statement in a client's code recently:

 

$regexp="/^[a-z0-9]+([_\\.-][a-z0-9]+)*@([a-z0-9]+([\.-][a-z0-9]+)*)+\\.[a-z]{2,}$/i";

Maybe that made sense (to validate an email address) many years ago, but we've had PHP filter_var() since PHP 5.2.  Regular expressions are very hard to get right!  And they are even harder to get right if they're strung together in a long, uncommented string.  Experienced programmers have a joke that goes, "I had 99 problems, so I used a Regular Expression.  Now I have 100 problems!"


11. Writing Code to Do Things That PHP Should Do for You

You wouldn't believe how many novice programmers do not understand the way date() and strtotime() work together.  Instead of using the built-in functionality, they try to write their own date computation algorithms, and the resulting code is almost always overly complicated and usually wrong in the edge cases, like leap year, daylight savings time, changing time zones, etc.  This is not the only place that PHP built-ins get ignored, but it's fairly common.  For another example, see the list of array functions for a collection of things that novices overlook or misunderstand.


12. Using the Wrong Tools

"Please bring me a wrench," said the carpenter.  


"What kind of wrench" said the apprentice, "Pipe wrench, lock wrench, spanner wrench, monkey wrench, Stillsons wrench...?


"Doesn't matter -- I'm just going to use it to pound nails."


File systems and data bases are different kinds of storage, and understanding the differences is an important part of application development.  Have you ever seen anyone store image files in a data base BLOB?  You won't see them doing it for very long, because as the image collection grows the data base will become slower and slower, impossible to back up, and generally an impediment to application deployment.  If you start down this path, expect to find yourself refactoring (or maybe someone else will be refactoring because you will get fired).  Image files (and by extension, sound or multimedia files) belong in the server file system.  The associated data base information should contain the URL of the image, not the image file.


The flip side of this misstep is seen in applications where some well-intentioned but misinformed designer created a flat text file or an XML document so he could store his files without using a data base.  Usually this happens when the designer is trying to avoid learning what a data base does.  The problem with this approach is not performance, but functionality.  XML is by its nature hierarchical.  A data base is relational.  The first time you have to find the relationships in an XML document, you will wish you had learned about data bases instead!


13. Failure to Put Comments Into the Code

It doesn't always have to be a doc-block, but there should be something to tell people what you're thinking and what you're trying to do!  And a doc-block never hurts.  It can provide tips and code hinting in some IDEs.


14. Using the ?> Close-PHP Tag

This tag is almost invariably followed by an invisible whitespace end-of-line character, which is browser output.  It can and will cause more trouble than it's worth.  Especially because it is invisible!  Unless you absolutely require the close PHP tag, omit it.  Really, PHP knows when it's finished!  Even if DreamWeaver doesn't!


15. Using the Global Keyword

This is really only needed when the application designer hasn't given enough thought to the organization of the data and classes.  It's a signature of disorganized thinking.  Avoid global variables.  Pass parameters and return values instead.


16.$_REQUEST

Avoid using the $_REQUEST variable.  The reason for this is simple: you do not know where the request data came from!  Was it from your HTML form?  Or did a malicious person simply type the request variables into the URL?  If you choose $_POST and $_GET you at least know the request method.  It does not protect you from hack attacks, but it makes them a little harder to perpetrate.


Another reason to avoid $_REQUEST is because of the request_order directive.  This (positively stupid) configuration variable can cause the same script to produce different outputs if the directive is changed.  Why did PHP introduce this stumbling block at 5.3?  Your guess is as good as mine!


17.Extract()

Like the use of register_globals, any programming that injects unnecessary variables into the symbol table has a bad code smell.  The wise programmer avoids variable proliferation.


18. Writing Your Own Version of Extract()

You may have seen something like this:

 

$u = $_POST["username"];
$p = $_POST["password"];

This accomplishes nothing.  It just copies one variable to another, creating two versions of the same data.  Perhaps the programmer forgot about filter_var()?  It also introduces a subtle psychological element of trust that is completely misplaced.


$u = $_POST['username'] only copies the contents of $_POST['username'] into the variable named $u.  Whatever was present, good or bad, in the original POST request variable is now "conveniently" addressable inside the shorthand $u.  The act of including the unfiltered attack vector in a more convenient variable name is one big part of this anti-practice, because it seems to encourage the use of shorthand $u instead of the longhand and more explicit $_POST['username'].  It seems to me that $_POST is a highly recognizable external variable, and therefore readily identified as a tainted data source.  Not so, for $u, and it requires us to read the code and follow $u back to its origin before we can know whether to trust $u.  So if you're going to create a shorthand notation, filter and sanitize the data before assigning it to the shorthand variable. 


This anti-practice is an example of failure to sanitize external data, as shown below.


19. Failure to Filter/Sanitize External Data

Here is the classic screwup:

 

$sql = 'DELETE FROM myTable WHERE id = ' . $_GET['id'];

Do you have any idea how many rows will be deleted?  No, you do not.  Maybe you think it is only one row?  But there is no LIMIT clause.  Want to know how common this blunder is?  In November, 2013, Michelangelo Van Dam gave a presentation to the Washington DC PHP Users Group in which he searched GitHub for the intersection of mysql_query() and $_GET.  This is what we saw.GitHub SearchWhat if $_GET['id'] came from a URL that said: url.php?id=0+OR+1+=+1

The resulting query would say DELETE FROM myTable WHERE id = 0 OR 1 = 1 and the OR condition would apply to every row of myTable. Bang! You're Fired!  This anti-practice has been an object of ridicule for years.  It's as dangerous as driving on the wrong side of the road.  Yet for some odd reason, there are still programmers who have not heard of the issue.  See http://xkcd.com/327/


19.a Relying on JavaScript to Filter/Sanitize External Data

A variant on failing to filter external data is relying on client-side technologies to do the work that must be done on the server-side (see the relationship here).  JavaScript and jQuery are used to make a nice experience for your human client, but they provide no protection at all against an attack because the attackers will simply bypass the client-side niceties and will post toxic data directly into your application.  This is easily accomplished with cURL or fsockopen() and the attack can even be tailored to look like it's coming from a browser that was referred by the HTML form on your web site.  All external data is, by definition, tainted, full stop.  You must filter it with server-side tools, after it has been received on the server.


20. Failure to Escape External Data

The query says, "INSERT INTO myTable (name) VALUES ({$_POST['name']}" and that should work, right?  Sure, until you find someone named O'Reilly.  Then you have a stray apostrophe in your query string, and that's a recipe for failure.  Apostrophes have a special meaning in query strings, unless you escape them.


21. Double Escape of External Data

Consider the PHP environment is set up with Magic Quotes, and the programmer, not sure about this, uses addslashes() or similar functions to escape the data.  As a result the data base is cluttered with double backslashes.  A common symptom of this misstep is code that uses stripslashes() on the fields that are drawn from the data base.


22. Failure to Escape Client Output

Envision a script reads an input field and stores it in the data base, then later it echoes the data out to a client browser.  What could possibly go wrong?  Well, the input field could have been poisoned with toxic JavaScript that will cause the client browser to begin an action that is harmful to the client computer and eventually to the client human.  When you see echo statements without htmlspecialchars(), you have to wonder...


23. Uninitialized Variables

http://xkcd.com/1193/ shows what can go wrong if you don't know what a variable contains.  Unfortunately PHP suppresses Notice level messages by default.  So novice programmers think it is OK to use uninitialized variables to mean FALSE or zero or an empty string, because that level of offense only rises to the level of a PHP Notice.  You can get away with this right up until the time that your manager wants to make a small change to your script and she chooses the variable name you were assuming would be empty or set with a value you expected.  If you raise the error_reporting() level to show the Notice messages, PHP will tell you about this pitfall.


24. Failure to Test the Return Values from PHP Functions

You wouldn't believe how many times the same programming error occurs, over and over, because something like this gets copied and perpetrated again and again.

 

$res = mysql_query($sql);
while ($row = mysql_fetch_assoc($res)) { ...

What's wrong here is the assumption that the return value in $res is a query result resource.  It might not be!  It might be FALSE.  If the program does not test for this condition, the script will eventually fail silently or with data damage.  MySQL is not a black box.  It can and will fail for reasons that are outside of the PHP programmers' control.  If your script cannot control something, it must test for the uncontrollable condition (in this case, query failure) and handle the conditions as they arise.


25. Wasting Time Not Knowing What Happened

Script failed, no output, what's wrong?  Well, the first thing to do is start creating output!  Use var_dump() to print out the inputs your script is receiving.  And to print out the intermediate variables.  And to visualize the data before the output phase.


25.a Wasting Other People's Time Not Knowing What Happened

It's amazing how many questions are posted here at EE with a code example and the question, "what's wrong?"  The code is useless; we already know it doesn't work.  What we really need to see instead is the data.  If your data takes the form of a return object (for example, from an API) and you post the output of print_r() or var_dump() we can read it, but it's unwieldy when we want to write some code that works with the data.  PHP has a perfect tool for solving this problem.  It's a built-in function, var_export().  Look it up, learn it now, test it out, and please use it when you want to show a colleague your test data.  And since PHP is still a "living language" with some things that are only half-baked, you may also want to read this note.  Var_export() may need help to work with StdClass.

http://www.php.net/manual/en/reserved.classes.php#113971


26. Wasting Time on Meaningless Optimization

Not limited to PHP by any means, but PHP programmers seem particularly interested in minutiae like whether it is faster to iterate over an array or an object.  The answer is, "it doesn't matter."  These kinds of pursuits are like milking a mouse.  No matter how much effort you put into the process, you will not get much.

http://xkcd.com/1205/


27. Overlooking Meaningful Optimization

Data transfers via disk I/O operations are several orders of magnitude slower than in-memory data transfers, and when there is a performance problem in a web application it is usually found in the data base.*  You need to know how your database works and how to optimize the data base and script file structure.  Here are some topical questions to ask yourself.


Are your scripts using something like MySQL_Fetch_Array()?  Perhaps you copied that from one of the many bad PHP examples that litter the internet?  Change that function to a variant of Fetch_Assoc() or better yet, Fetch_Object().  The Fetch_Array() variant retrieves twice as much data as is needed, making it the least efficient way to retrieve your data.


Do you have a SELECT * query?  If so, eliminate the * and put in the names of the columns you want to retrieve.  SELECT * retrieves all of the columns, even the ones you do not need, making it the least efficient way to retrieve data.


Do you have a SELECT query that is expected to retrieve one row (such as a user-id lookup or a single inventory item)?  If so, be sure you have LIMIT 1 in the query.  Failure to LIMIT the query will result in a table scan, making it the least efficient way to find the information.


Does your code loop around to add values as it retrieves the results set from the MySQL server? Maybe you should be using server side aggregations with a GROUP BY clause.


Are your data base tables appropriately indexed?  You may want to have an index on every column used in WHERE, ORDER, GROUP, JOIN and HAVING clauses.


Are your queries sargable?


Have you used EXPLAIN SELECT to optimize queries that touch more than one table?


If you recognize some of these issues in your queries, or more importantly, don't understand some of them, you might want to read this excellent article by our EE colleague, gr8gonzo.  Entire books are devoted to MySQL and optimization, especially at large scale, is a highly technical and detailed endeavor.  But every programmer should have an understanding of the basics.


* Or sometimes in the API calls to other applications with slow data base queries!


28. Skipping Best Practices

You really want to just get it done fast, right?  See http://xkcd.com/292/  The cartoon is a joke, but the best practices are not.  They are the way experienced programmers have learned to get the best and fastest results.


29. Manipulative Programming

(Added in 2016) Undoubtedly you've been in a conversation with a technical manager who has no sense of marketing, and something like this has come up: "How can we force the users to ___ (fill in the blank)?"  The answer is not found in technical details; it's found in human nature and common sense.  If your clients feel like they are being coerced, they will go away, and that is that.  One obvious manifestation of this anti-practice is demanding that clients fill out a registration form before they can see free information, or refusing to accept forms until they are complete, or making a web site that fails if it cannot set HTTP cookies.  Don't do things that drive customers away!  Instead, be as open and informative and inviting as you can be.  You can bet your competitors are trying to be nice to your clients.  Don't lose your customers over a design flaw.


Summary

Knowing what to do is important, and so is knowing what not to do.  Do you have some examples of wrongheaded designs and practices?  Please share them in the article comments below.  If we're smart about things, we can learn to avoid the AntiPHPractices, choosing best practices instead.


Please give us your feedback!

If you found this article helpful, please click the "thumb's up" button below. Doing so lets the E-E community know what is valuable for E-E members and helps provide direction for future articles.  If you have questions or comments, please add them.  Thanks!

 

14
Author:Ray Paseur
Ask questions about what you read
If you have a question about something within an article, you can receive help directly from the article author. Experts Exchange article authors are available to answer questions and further the discussion.
Get 7 days free