This webpage is specifically about transforming the URIs sent by a user into something that better matches your webserver.
You may instead desire either a thorough general description of .htaccess or an in-depth description of exactly how .htaccess behaves (particularly useful if you're trying to do something arcane and it isn't working).
If your website is hosted on an Apache server, and you (i) want clients to automatically adjust to your revised website layout, or (ii) want clients to see friendly URIs even though your actual local path- and file-names are more complex, or (iii) want web search engines to get a cleaner view of your website for better SEO, you will need to do some sort of rewriting of requested URIs. Apache offers four different ways to do this:
Use Symlinks
Pros: As they are an OS feature, symbolic links are very easy to create among Apache's files. Symlinks are available to Apache on almost all Apache webservers. Performance of the OS is barely affected, and performance of Apache itself is not affected at all. More than most other options (but still not perfectly), symlinks can be modified one at a time on live websites. No .htaccess file is needed. And symlinks may be the only available option on some shared webservers.
Cons: There is no centralized list of all potential URI rewrites, so if there are more than just a few symlinks it's very easy to lose track of exactly what is being done. This makes both backup/recovery and migration to a different webserver difficult. It can also make it difficult to recognize patterns, nonsensical URI rewriting, and potential loops. Symlinks can only do the equivalent of internal-redirects (not external-redirects), which will be a notable limitation to some (but not all) websites.
Use <if>...</if> <elseif>...</elseif> <else>...</else>
Pros: This XML-like syntax is simpler and more similar stylistically to the system-wide Apache configuration file (probably httpd.conf). It allows implementation of logical programming structures in ways that look similar to other computer languages. All the variables and all the expressions of mod_rewrite continue to be available through this new interface.
Cons: This option is only available in versions 2.4 and later (earlier versions will simply return a 500 code to the client if this syntax is used). Penetration of the 2.4 version has been slow, and it may not be available to you, particularly if you use a shared webserver. And possible future website migration to a different webserver could be blocked because the new webserver is at a lower version. Apparently diagnostic messages are practically nonexistent and debugging is quite difficult. (To be fair though, debugging with other alternatives is also problematic.) And documentation and examples are quite sparse.
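As a rough illustration of this syntax, here is a minimal sketch that sends requests for any other hostname to one canonical hostname. The hostname is a placeholder, and the sketch assumes Apache 2.4 or later with mod_alias loaded.

```apache
# Hypothetical hostname; the expression syntax requires Apache 2.4 or later
<If "%{HTTP_HOST} != 'www.example.com'">
    # Applied only when the Host header is not the canonical name
    Redirect "/" "http://www.example.com/"
</If>
```

The condition reads much like an if statement in other languages, which is the main attraction of this option.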
Use mod_alias
Pros: Can do most of what mod_rewrite does, but with a simpler and more intuitive syntax. It can do both internal-redirects (specified as an Alias ... directive) and external-redirects (specified as a Redirect ... directive). It's much clearer from the syntax that it's a specification-oriented system, not some sort of procedural language.
Cons: Unfortunately this option is often not even considered, perhaps because the name mod_alias doesn't initially seem to have anything to do with rewriting, or perhaps because of the widespread misconception that you cannot do SEO-related things with it. Diagnostic messages are practically nonexistent and debugging is quite difficult. Conditionals are limited (but not giving you the rope to hang yourself isn't necessarily a bad thing). If later you find yourself wanting to do one of the things mod_alias can't handle (such as examining or modifying the Query string, or testing anything other than the URI itself), you may need to translate everything to mod_rewrite, as mixing the two isn't recommended.
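To give a feel for the syntax, here is a small sketch of two mod_alias external-redirects. The paths and hostname are made up for illustration, and the sketch assumes mod_alias is loaded and that redirects are permitted in your .htaccess file.

```apache
# Hypothetical paths and hostname, for illustration only

# Prefix-based external-redirect: the client is told to fetch the new URI
Redirect 301 "/old-page.html" "http://www.example.com/new-page.html"

# Regex-based external-redirect: $1 is the captured article number
RedirectMatch 301 "^/blog/([0-9]+)/?$" "http://www.example.com/articles/$1"
```

Each line states a mapping and nothing more, which is what makes the specification-oriented nature of mod_alias easy to see.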
Use mod_rewrite
Pros: It can do pretty much anything you can imagine. If you actually get it working in production, you get bragging rights as a certified nerd. It can manipulate every bit of a URI, including the Query string. It integrates alternate ways of setting cookies and of manipulating environment variables, so you can be sure all your logic stays in sync. Pretty much any complex conditional can be expressed (but sometimes only through some trickery).
Cons: Its syntax is weird (it can feel like a general procedural language even though it isn't), existing code is hard for someone else to understand, maintenance is rather error-prone, diagnostic messages are practically nonexistent on many systems and debugging is quite difficult, and significant gotchas await (such as very common looping in naive mod_rewrite .htaccess files). Extensive conditionals allow you to use overly complex solutions, rather than forcing you to keep your design simple. Documentation maintained and supplied by the Apache Software Foundation says 'mod_rewrite' should be considered a last resort, when other alternatives are found wanting. Using it when there are simpler alternatives leads to configurations which are confusing, fragile, and hard to maintain. Understanding what other alternatives are available is a very important step towards 'mod_rewrite' mastery.
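For comparison with the other options, here is a minimal sketch of a mod_rewrite rule that manipulates the Query string. The script name product.php and the URI layout are hypothetical, and the sketch assumes the .htaccess file sits in the website's root directory.

```apache
RewriteEngine on
# Hypothetical script name: internally map /product/42 to /product.php?id=42
# QSA appends any query string the client sent to the rewritten one
RewriteRule ^product/([0-9]+)$ product.php?id=$1 [QSA,L]
```

The rewritten target product.php does not match the pattern ^product/, so a repeated pass through the file leaves it alone.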
Three of the options provide (but do not require you to use) the full power of PERLish/PCRE regular expressions (also called regex or RE), so that's seldom a differentiator. (What this providing of powerful REs really means is that no matter which option you use, a regex nerd will likely be able to express the equivalent of your final result in fewer shorter statements ...but so what?)
(Note that with most alternatives, if you have root access to a dedicated webserver there are more reasonable debugging options. But on a shared webserver, I know of no reasonable debugging options for any of the alternatives. So build up your .htaccess file just a little bit at a time, testing at every step, and stopping immediately whenever something doesn't work quite right or behaves in an unexpected way. Then solve and correct [or at least fully understand] that issue before continuing. By doing things in baby steps you won't wind up with a large mass of code that doesn't work but you have no idea why not and no idea where to start or what to look for.)
The rest of this webpage assumes you've selected the mod_rewrite alternative (sometimes derogatorily called the old way). To create just a simple mod_rewrite recipe in a .htaccess file, just following the few simple rules of thumb below should bypass all the potential problems. (But to create a more complex mod_rewrite recipe, understanding more about how mod_rewrite actually works will likely be invaluable.)
(Another way to skim over all the whys and just hit the important do this is to look only at the accented [light plain background like this] portions below.)
Processing of .htaccess files by mod_rewrite is a little odd. There are a couple good reasons for this. First, as normal Apache rewriting and redirection are all finished long before individual .htaccess files are even reached, rewrite modules must undo then redo then fake out everything. And second, to address the conflicting requirements of providing the best performance in a process that's done for every single request (not just once) while keeping the configuration fairly simple, the processing is sometimes more unusual than what you'd naively expect. And although some things about mod_rewrite are minutely documented, many others are hardly documented at all.
Its quirks can make mod_rewrite seem much harder than it really is (especially if you try to create a simple naive .htaccess file without knowing how to avoid the gotchas). These webpages list standard workarounds for the gotchas, common idioms, and other usage hints.
Remember though that the golden rule of mod_rewrite is KISS (Keep It Simple Stupid). Before you try to implement anything complicated, check again if tweaking your website layout would make the problem go away entirely, if just a couple symlinks could easily solve the problem, if handling a whole subdirectory all at once rather than individual files would greatly simplify the problem, if the extra functionality is really necessary, if the problem can be prized apart into two separate and much much simpler problems, if the problem can be expressed more crisply, and in general if there's any easier way.
(Many times these webpages will refer to the absolute local file directory corresponding to the website root, the document root, which is of course different on different systems. This value is available in .htaccess files as %{DOCUMENT_ROOT}. But typing that is long and awkward [and possibly not as clear as it should be]. So the rest of this webpage will refer to the value of the document root path on your computer as simply rootpath.)
So what are the most basic rules of thumb for using mod_rewrite?
The rules for handling multiple mod_rewrite .htaccess files, and exactly what's presented to each one (i.e. the per-dir relativizing of filenames), are tricky. While multiple mod_rewrite .htaccess files occasionally have their place, they're not helpful in the vast majority of cases; simply avoid them. (For the infrequent cases where they're actually helpful, understand what per-dir means and exactly how it works.)
In fact, it's best to start out with mod_rewrite statements (even just RewriteEngine on/off) only in the .htaccess file at the root of the whole website. In that case, filenames will always include the full path, and relative-vs.-absolute errors will largely disappear since they'll be the same anyway.
    Options +FollowSymLinks
    RewriteEngine on
    RewriteBase /
    RewriteCond %{ENV:REDIRECT_STATUS} \d\d\d [OR]
    RewriteCond %{REQUEST_FILENAME}==%{ENV:SAVED_REQUEST_FILENAME} ^(.*?)==\1$ [OR]
    RewriteCond %{REQUEST_URI} ^.{300}
    RewriteRule ^ - [L]
    RewriteRule ^ - [E=SAVED_REQUEST_FILENAME:%{REQUEST_FILENAME}]
(The argument to RewriteBase should normally be the URI path to whichever subdirectory the .htaccess file is in [or you might think of it as the path from the website's root to the subdirectory the .htaccess file is in]. Use RewriteBase / in the .htaccess file in the website's root directory. If you have a complex setup with .htaccess files in subdirectories, the argument to RewriteBase should be adjusted in each one, probably to something like RewriteBase /grandparentsubdir/parentsubdir/thissubdir.)
The first mod_rewrite statement in a .htaccess file should always be RewriteEngine on, immediately followed by the other two. Even though it may not seem like it, the symlinks option really is relevant to the operation of mod_rewrite. In a few cases, to maintain security, mod_rewrite only works fully correctly if symlinks are enabled within Apache. The easiest resolution is to just turn them on all the time, even though they're often not really required. (Make sure the system-wide Apache configuration [probably file httpd.conf] doesn't try to force symlinks permanently off or disallow overriding of that option, either of which can result in symlinks not being on even though you've specified the correct line in your .htaccess file.)
(The RewriteBase / statement [the argument / would usually be different somewhere else besides the webserver's root directory] only affects relative external-redirects, and isn't really necessary because you shouldn't use such redirects anyway. However example code on the Internet uses RewriteBase / so widely it seems prudent to include it.)
(The net effect of RewriteBase / is to make relative external-redirect statements [like RewriteRule ... filename [R=301,L]] return to the client a revised URI of the form http://yourwebserver/filename rather than http://yourwebserver/grandparentpath/parentpath/filename. Unfortunately incorrect descriptions of RewriteBase -including that it has something to do with making absolute filenames relative- are so widespread it can be difficult to figure out what the directive really does. Fortunately, it doesn't matter very much, especially when mod_rewrite appears only in the website's root directory.)
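As a sketch of RewriteBase in a subdirectory, suppose a .htaccess file lives in a hypothetical /shop/ subdirectory of the website (that is, at rootpath/shop/.htaccess). The directory and file names here are assumptions made up for the illustration.

```apache
# Assumed location: rootpath/shop/.htaccess (the /shop/ subdirectory is hypothetical)
RewriteEngine on
RewriteBase /shop/
# The relative target new.html is prefixed with the RewriteBase value,
# so the client is redirected to /shop/new.html on the same host
RewriteRule ^old\.html$ new.html [R=301,L]
```

Writing the target as an absolute URI instead sidesteps the question of what RewriteBase does, which is why relative external-redirects are best avoided.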
If you temporarily change RewriteEngine on to RewriteEngine off, the .htaccess file will still be considered a mod_rewrite .htaccess file. The exact same .htaccess files will still be identified and processed in the exact same order. However, sometimes some mod_rewrite statements will be partially processed anyway (not just completely ignored as expected). For example some RewriteRule statements may act as internal redirects, even though RewriteEngine off has been specified and even though the statements normally act as external redirects.
First, a brief explanation of one of the reasons why loops happen: In .htaccess files, the [L]ast flag does not do what you probably expect it to do. In an httpd.conf context, both [N]ext and [L]ast behave exactly the way they would in PERL, either looping to the top of the file or exiting the file completely.
But in a .htaccess context, [L]ast behaves a little differently. It ends the current pass (so calling it [L]ast still makes reasonable sense). But it does not necessarily exit the file completely. If the target was changed (redirected) by the current pass (the one that encountered the [L]ast flag), execution starts all over again from the top of the file! (See some of the potential for infinite loops and duplication?)
So in a .htaccess context, [N]ext and [L]ast sound pretty much alike at first. Why are there separate flags at all? Do they really behave differently? It turns out they do behave differently ...but in rather subtle ways that may not be immediately obvious. Both always go to the top of the .htaccess file. But [N]ext always makes another pass (with existing environment variables), whereas [L]ast only makes another pass (with greatly modified and even lost environment variables) if the target was changed/redirected by the previous pass.
And what can you do to prevent this problem? (You should do at least two and perhaps all three of these; don't just choose one.)
Add additional boilerplate code like this after the first three mod_rewrite lines, before any actual rewriting:
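    Options +FollowSymLinks
    RewriteEngine on
    RewriteBase /
    # backstop: detect an extra pass through the file
    RewriteCond %{ENV:REDIRECT_STATUS} \d\d\d [OR]
    # backstop: detect use of the [N]ext flag
    RewriteCond %{REQUEST_FILENAME}==%{ENV:SAVED_REQUEST_FILENAME} ^(.*?)==\1$ [OR]
    # backstop: detect unreasonably long requests
    RewriteCond %{REQUEST_URI} ^.{300}
    RewriteRule ^ - [L]
    # remember the filename for comparison on a later pass
    RewriteRule ^ - [E=SAVED_REQUEST_FILENAME:%{REQUEST_FILENAME}]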
Every mod_rewrite .htaccess file should include these boilerplate statements. These statements do not interfere with the other solutions, cause only minute performance degradation that isn't noticeable, and in general cause no harm. In fact, some of them aren't even activated very often. What they do is provide backstop protection against most loops. The first RewriteCond detects extra passes through the file. The second RewriteCond detects use of the [N]ext flag, which is usually erroneous. (If you legitimately and correctly use the [N]ext flag you will need to comment out this line.) The third RewriteCond detects either erroneous use of external-redirects that cause the URI to grow and grow and grow indefinitely, or requests that clearly make no sense. And the last RewriteRule sets up for the second pass through the top of the file.
(The reason for the length test is to dispatch the nonsensical request right away without applying the RE engine to ridiculously long strings. [Requests may be unreasonably long because some browser has gone nuts, or more likely because a mod_rewrite error is causing the URI to grow and grow and grow.] In the above example, the maximum reasonable length of the request was somewhat arbitrarily chosen to be 300 characters. While this is suitable in most cases, for your particular website you might need to adjust the number up or down.)
Use the [END] flag rather than the [L]ast flag in .htaccess files if that's what you really mean and if you can.
(Problem is, the [END] flag is only supported by versions 2.3.9 and later and may not be available to you and/or may not be sufficiently portable.)
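A minimal sketch of the [END] flag, assuming Apache 2.3.9 or later and a .htaccess file in the website's root directory. The file names are hypothetical.

```apache
RewriteEngine on
# Hypothetical file names; [END] requires Apache 2.3.9 or later
# Like [L], [END] ends the current pass, but it also prevents the
# additional pass that would otherwise follow the internal-redirect
RewriteRule ^about$ about-us.html [END]
```

With [L] in place of [END], the rewritten target about-us.html would be run through the file again from the top.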
Code your mod_rewrite .htaccess file cautiously.
This is always possible one way or another. But it's quite easy to accidentally goof up so what should have been just a trivial change causes an infinite loop or a double execution. Some specific ways to avoid over-complication:
Avoid using the [N]ext flag at all in simple mod_rewrite .htaccess files.
Just don't. If you're still unsure, you can peruse this more thorough discussion of the [N]ext flag.
(Even in complex mod_rewrite .htaccess files, be wary of using the [N]ext flag. Consider everything else first. If you have a good use for it, be sure there's a terminating condition [RewriteCond or RE] on or before the statement with the [N]ext flag.)
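For the rare case where [N]ext is appropriate, here is a sketch of a rule whose RE doubles as the terminating condition. It is an illustration only: it assumes the backstop RewriteCond that detects [N]ext has been commented out, and the underscore-to-hyphen task is just an example.

```apache
RewriteEngine on
# Replace underscores in the path with hyphens, one per pass.
# Terminating condition: once no underscore remains, the pattern
# no longer matches and the looping stops.
RewriteRule ^(.*)_(.*)$ $1-$2 [N]
```

Each pass removes exactly one underscore, so the number of passes is bounded by the number of underscores in the request.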
Make your conditionals (both REs and RewriteConds) so specific they'll only match exactly what needs changing and will never ever match anything else (including the result of what's intended to be the last redirect).
In other words avoid the one extra time loop by always making sure the ruleset won't match an extra time, rather than by hoping the ruleset will never be executed an extra time in the first place. For example explicit loop protection sometimes looks like this:
    # assumes this .htaccess file is in the website's root directory
    RewriteCond %{REQUEST_URI} !^/foobar\.baz$
    RewriteRule foo foobar.baz [L]

or perhaps (if you're an RE nerd) even like this:

    RewriteRule ^(?!foobar\.baz$).*foo foobar.baz [L]
Give a little bit of thought (but not too much:-) to constructing crisp Regular Expressions for your conditions.
The best way to construct Regular Expressions is to stop thinking about elegance and magic and performance and the other parts of the string, and simply write exactly what you mean about only the relevant parts of the string.
Consider adding early in the .htaccess file a separate rulegroup that recognizes correct targets that need no further rewriting, and exits the rewriting process immediately so there's no possibility of accidentally making any additional changes.
For example:
    # this example helps clarify the concept--
    # its specifics are awfully dumb though;
    # it probably shouldn't be used as is
    # set %1 to path-part
    RewriteCond %{REQUEST_FILENAME} ^/*+(.*?/)/*+[^/]*$
    # is path-part an existing directory?
    RewriteCond %1 -d
    RewriteRule ^ - [L]
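A more commonly seen form of the same early-exit idea is sketched below. It treats any request that already maps to an existing file or directory as a correct target. Whether that is the right definition of "correct" for your website is an assumption you should check.

```apache
RewriteEngine on
# If the request already maps to an existing file or directory, leave it alone
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
# The "-" target means "no change", so nothing is rewritten on this pass
RewriteRule ^ - [L]
```

Because the rule changes nothing, the [L]ast flag here ends the pass without triggering another one.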
Mixing mod_alias and mod_rewrite will do something (occasionally even what you want:-), and it will do the same thing every time. But understanding the behavior of the combination thoroughly enough to get it to do tricky things reliably and maintainably is so weird it's better to just avoid the mixture entirely.
Specifying both Alias ... and Rewrite... statements in the same .htaccess file seems so logical: first specify with Alias ... any whole directories that have been relocated internally, then specify with Rewrite... the more esoteric remappings for individual files and uncommon cases. But unfortunately it doesn't actually work. In fact mixing Alias ... and Rewrite... statements will do something different than what you intended so often that it's best simply avoided.
mod_rewrite ([E=NAME:value]) and mod_env (SetEnv ... and SetEnvIf ...) both have full access to all the environment variables. So it's tempting to switch back and forth between mod_rewrite and mod_env when handling environment variables. But don't.
The relative execution timing of statements belonging to the two different modules is almost certainly not what you wish it were. The net result is often a variable being set after the other module has already read the empty value from it, often leading to subtle logic errors that are hard to debug.
One way to avoid this problem is to mentally assign each variable to either mod_rewrite or mod_env, and never ever set or read that variable with the other module. (Another way to avoid this problem is to not use mod_env at all, setting environment variables exclusively with mod_rewrite.)
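As a sketch of keeping a variable entirely within mod_rewrite, the fragment below both sets and reads a hypothetical variable named SECTION. The variable name, the downloads/ directory, and the .bak rule are all made up for illustration.

```apache
RewriteEngine on
# SECTION is a hypothetical variable owned exclusively by mod_rewrite
RewriteRule ^downloads/ - [E=SECTION:downloads]
# Read it back with mod_rewrite as well: forbid backup files in that section
RewriteCond %{ENV:SECTION} =downloads
RewriteRule \.bak$ - [F]
```

Since both statements belong to the same module and run in file order, the variable is set before it is read.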
In simple cases the above are all the rules of thumb you'll ever need. Here's a description of some of the more nitty-gritty details that may be useful when implementing complex uses of mod_rewrite.