Drupal on Glassfish with clean urls using Url Rewrite Filter

You have a Glassfish server.

You are a Drupal developer.

You want to run Drupal in Glassfish. More importantly, you want it to use clean urls, because without that capability all of your urls look like this: /index.php?q=foo/bar/baz. Which sucks, of course.

Let's set aside the desirability of running PHP in a Java application container for the moment (see note 1), and jump right to the meat of this geeky post. Here's how I got some of the clean url functionality you'd normally get from Apache's mod_rewrite, or from mod_rewrite or Lua in lighttpd.

I'm assuming you've already got Glassfish installed, so from there:

  1. Get a copy of Quercus, Caucho's Java implementation of PHP 5. I downloaded the version 3.2.1 .war file.
  2. Unzip the .war into its own directory (see note 2):
    > mkdir quercus-3.2.1
    > cd quercus-3.2.1
    > jar -xvf ../quercus-3.2.1.war
    > cd ../
  3. Get a copy of Url Rewrite Filter. I used version 3.2.0 (beta), but 2.6 should work also.
    > cd quercus-3.2.1
    > wget http://urlrewritefilter.googlecode.com/files/urlrewritefilter-3.2.0-src.zip
    > unzip urlrewritefilter-3.2.0-src.zip
  4. Get Drupal. I used the latest 6.x version:
    > cd ../
    > wget http://ftp.drupal.org/files/projects/drupal-6.10.tar.gz
    > tar zxvf drupal-6.10.tar.gz
     
    Copy Drupal files to the quercus docroot
    > cp -r drupal-6.10/* quercus-3.2.1/
  5. Configure Url Rewrite Filter. This is where it gets a little sketchy. Drupal comes prepackaged with a .htaccess file, which sets up the mod_rewrite rules for Apache:
    # Rewrite current-style URLs of the form 'index.php?q=x'.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]

    Unfortunately, Url Rewrite Filter doesn't support the REQUEST_FILENAME variable (yet), so it can't check whether the requested file or directory actually exists. So I've put in some hacks to at least get clean urls working. I don't pretend that this is production-ready, but it's enough for basic testing. If anyone has input, I'd welcome it. Anyway, in the WEB-INF/web.xml file, the following elements need to be added:
    <filter>
      <filter-name>UrlRewriteFilter</filter-name>
      <filter-class>org.tuckey.web.filters.urlrewrite.UrlRewriteFilter</filter-class>
    </filter>
    <filter-mapping>
      <filter-name>UrlRewriteFilter</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>

    This should go before the <servlet></servlet> section.

    Next, create WEB-INF/urlrewrite.xml:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE urlrewrite
    PUBLIC "-//tuckey.org//DTD UrlRewrite 2.6//EN"
    "http://tuckey.org/res/dtds/urlrewrite2.6.dtd">
    <urlrewrite>
      <rule>
        <note>
          Prevent rewriting of specific files. Definitely not
          the best way to do this.
        </note>
        <from>^/(.*)(css|js|png|jpg|gif)$</from>
        <to>/$1$2</to>
      </rule>
      <rule>
        <note>
          Prevent rewriting of files in the files directory
        </note>
        <from>^/(.*)/files/(.*)$</from>
        <to>/$1/files/$2</to>
      </rule>
      <rule>
        <note>
          Do the Drupaly stuff
        </note>
        <from>^/(.*)$</from>
        <to>/index.php?q=$1</to>
      </rule>
    </urlrewrite>
  6. Re-zip your directory back into a .war file to deploy onto the app server. Use -C so the files land at the root of the archive rather than under a quercus-3.2.1/ prefix:
    > jar -cvf quercus-3.2.1.war -C quercus-3.2.1 .
  7. Deploy to Glassfish through the admin console, or by dropping the .war into the autodeploy directory. Note that you should set your context-root to / to run Drupal at the root of the app server.
  8. Oh yeah, you'll probably want to connect to a database too, right? In the admin panel, go to Resources > JDBC > Connection Pools and create a new Connection Pool. Let's call it mysqlpool.

    Set Datasource Classname to com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource.

    Plug in the following Additional Parameters for your Drupal DB:

    password
    user
    databaseName
    portNumber
    serverName

That should be about it.
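
If you want to see what the three rules from step 5 do to a given path before deploying anything, here is a rough stand-alone sketch in plain Java. It is not Url Rewrite Filter code: the class and method names are made up for illustration, and it assumes processing stops at the first rule that matches, which is what the notes in the rules intend.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Url Rewrite Filter: replays the three rules on a path. */
final class RewriteRulesSketch {
    // {from, to} pairs copied from urlrewrite.xml, in document order
    private static final String[][] RULES = {
        { "^/(.*)(css|js|png|jpg|gif)$", "/$1$2" },
        { "^/(.*)/files/(.*)$", "/$1/files/$2" },
        { "^/(.*)$", "/index.php?q=$1" },
    };

    /** Applies the first rule whose pattern matches (assumption: later rules are then skipped). */
    static String rewrite(String path) {
        for (String[] rule : RULES) {
            Matcher m = Pattern.compile(rule[0]).matcher(path);
            if (m.matches()) {
                // Java's replacement syntax uses $1, $2 just like the <to> element
                return m.replaceFirst(rule[1]);
            }
        }
        return path;
    }

    public static void main(String[] args) {
        String[] samples = {
            "/node/1",
            "/misc/drupal.js",
            "/sites/default/files/report.pdf",
            "/upload/js",
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + rewrite(sample));
        }
    }
}
```

Running it shows /node/1 becoming /index.php?q=node/1 while the .js file and the file under files/ come back unchanged. It also shows why the first rule is a hack: there is no dot in the pattern, so a sample path like /upload/js is treated as a static file and never reaches index.php.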

1: There are a couple of reasons I think this is interesting. First, Caucho claims that Quercus should run PHP apps at least as fast as Apache + APC, and I've seen numbers of 24-56% faster for the same complex app. Second, you can use Java functions natively from within PHP. Working in an environment with a lot of Java code, that's potentially appealing, especially for complex scientific functions I don't really want to re-implement in PHP. Third, the ability to deploy a .war file containing the whole Drupal docroot is also interesting.

2: I expect that most Drupal developers aren't using IDEs like Eclipse or NetBeans, so I'm just using command-line tools here.

Comments

This rewrite is inadequate. I've been getting incredibly frustrated finding a way to make clean URLs work with Drupal on Quercus for this reason.

Specifically URLs like:

http://localhost:8080/batch?op=start&id=4
http://localhost:8080/admin_menu/flush-cache?destination=admin

fail miserably. There has to be a better way to emulate mod_rewrite in a servlet container. I just can't find it for the life of me. I tried updating the regex for the latter case to:

  <rule>
    <note>
      Do the Drupal stuff
    </note>
    <from>^/(.*)|(.*)(\?(.*))$</from>
    <to>/index.php?q=$1&amp;$4</to>
  </rule>

This looks too ugly to be very right though. I'm no reg-expert.
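
For what it's worth, the target URL is easier to reason about if the path and the query string are treated as two separate values rather than captured with one regex. A servlet container already exposes them separately (getServletPath() and getQueryString() on HttpServletRequest). The sketch below only shows the string handling; the class and method names are hypothetical, and it assumes the query string arrives already URL-encoded.

```java
/** Hypothetical helper: builds the index.php target the way mod_rewrite's [QSA] flag would. */
final class QueryStringSketch {

    /**
     * @param path     request path with a leading slash, e.g. "/batch"
     * @param rawQuery the part after '?', or null when the request has none
     */
    static String toDrupalUrl(String path, String rawQuery) {
        // Drupal expects q without the leading slash
        String q = path.startsWith("/") ? path.substring(1) : path;
        String target = "/index.php?q=" + q;
        // Append the original query string, if any, after q
        return (rawQuery == null || rawQuery.isEmpty()) ? target : target + "&" + rawQuery;
    }

    public static void main(String[] args) {
        System.out.println(toDrupalUrl("/batch", "op=start&id=4"));
        System.out.println(toDrupalUrl("/admin_menu/flush-cache", "destination=admin"));
        System.out.println(toDrupalUrl("/node/1", null));
    }
}
```

For the two failing URLs above this yields /index.php?q=batch&op=start&id=4 and /index.php?q=admin_menu/flush-cache&destination=admin.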

I'm almost to the point of getting a separate hosting account just to use Apache for this and proxy calls to my Java host. That not only sounds terribly annoying, but probably won't help with .htaccess rules like:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

as these require access to the file system.

Has anyone found a better way?
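
To make the file system part concrete: the two RewriteCond lines only say "rewrite when the request is neither an existing file nor an existing directory". In plain Java that check is short. This is a sketch with made-up names, assuming the docroot is available as a directory on disk; it does not guard against ".." in the path.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Hypothetical helper: the "!-f" and "!-d" tests from Drupal's .htaccess in plain Java. */
final class FileCheckSketch {

    /** True when the request should go to index.php, i.e. it is neither a file nor a directory. */
    static boolean shouldRewrite(File docroot, String requestPath) {
        File target = new File(docroot, requestPath);
        return !target.isFile() && !target.isDirectory();
    }

    /** Creates a throwaway docroot containing misc/drupal.js, for demonstration only. */
    static File sampleDocroot() {
        try {
            File docroot = Files.createTempDirectory("docroot").toFile();
            File misc = new File(docroot, "misc");
            misc.mkdir();
            new File(misc, "drupal.js").createNewFile();
            return docroot;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File docroot = sampleDocroot();
        System.out.println(shouldRewrite(docroot, "/misc/drupal.js")); // existing file
        System.out.println(shouldRewrite(docroot, "/misc"));           // existing directory
        System.out.println(shouldRewrite(docroot, "/node/1"));         // neither
    }
}
```

Only the last call returns true, so only /node/1 would be handed to index.php.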


Yeah, I've found after playing with Quercus a bit more that it's not quite ready for production, except perhaps for very specific use cases.

For example, it doesn't support LDAP or some classes in SPL, which makes it unusable for my applications.

I've also found it to be not quite as fast as the lighttpd+APC combo we run in production. At this point, I'd probably only use it for things that were heavily integrated with Java code.

I am posting this quite late, but it may still be of use to someone.
Well, I was trying to deploy my Drupal site on Glassfish and ran into the clean url problem. This post really helped me. But as Todd mentioned in the post above, we need to account for these conditions:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

So this is what I did:

- Removed the previous 2 rules
- Replaced the 3rd rule with:

<rule>
  <note>
    Do the Drupaly stuff
  </note>
  <condition type="request-uri" operator="notequal">^/(.*)/files/(.*)$</condition>
  <condition type="request-uri" operator="notequal">^/(.*)(css|js|png|jpg|gif)$</condition>
  <from>^/(.*)$</from>
  <to>/index.php?q=$1</to>
</rule>

It works great!

A simple clean url filter written against the standard servlet API:

package org.drupal.servlet;

import java.io.File;
import java.io.IOException;
import java.net.URLEncoder;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;

import javax.servlet.*;
import javax.servlet.http.*;

/**
 * Emulates Drupal's mod_rewrite rules: requests for files or directories that
 * exist on disk pass through untouched; everything else is forwarded to
 * /index.php with the original path in the "q" parameter.
 */
public class CleanUrlServletFilter implements Filter
{
  private ServletContext ctx;

  public void init(FilterConfig config)
  throws ServletException
  {
    ctx = config.getServletContext();
  }

  public void doFilter(ServletRequest request,
                       ServletResponse response,
                       FilterChain nextFilter)
  throws ServletException, IOException
  {
    final HttpServletRequest req = (HttpServletRequest) request;

    // getRealPath() returns null when the path cannot be mapped to the file
    // system; in that case pass the request through instead of failing.
    String realPath = ctx.getRealPath(req.getServletPath());
    if (realPath == null || new File(realPath).exists())
    {
      nextFilter.doFilter(request, response);
      return;
    }

    // Drupal expects the path without the leading slash, e.g. "node/1"
    final String q = req.getServletPath().substring(1);

    // Copy the original parameters and add q as a String[] like the others
    final Map<String, String[]> parameters =
        new HashMap<String, String[]>(req.getParameterMap());
    parameters.put("q", new String[] { q });

    // Equivalent of [QSA]: q first, then the original query string if any
    String original = req.getQueryString();
    final String queryString = "q=" + URLEncoder.encode(q, "UTF-8")
        + (original == null ? "" : "&" + original);

    HttpServletRequest newreq = new HttpServletRequestWrapper(req)
    {
      public String getParameter(String name)
      {
        String[] v = parameters.get(name);
        return (v == null || v.length == 0) ? null : v[0];
      }

      public Enumeration<String> getParameterNames()
      {
        return Collections.enumeration(parameters.keySet());
      }

      public Map<String, String[]> getParameterMap()
      {
        return parameters;
      }

      public String[] getParameterValues(String name)
      {
        // null for unknown parameters, as the servlet API specifies
        return parameters.get(name);
      }

      public String getQueryString()
      {
        return queryString;
      }
    };

    ctx.getRequestDispatcher("/index.php").forward(newreq, response);
  }

  public void destroy()
  {
  }
}

Then just map it in your web.xml:

<filter>
  <filter-name>CleanUrlServletFilter</filter-name>
  <filter-class>org.drupal.servlet.CleanUrlServletFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>CleanUrlServletFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>


Thanks Andrea! I'll have to give that a try.