Web applications policy
Files location
Most web applications use a simple single-directory setup, partly for multi-platform compatibility reasons, and partly for usability in an environment without shell access, such as web hosting. Since we don't have such constraints when producing packages for Linux, we should try to ensure an FHS-compliant setup, for the following reasons:
- security: it's ridiculous to install sensitive files under the web root and then deny access to them through the web server configuration, when simply installing them outside the web root would be enough
- flexibility: FHS compliance makes it easier to reuse the same setup, notably between different virtual hosts, without duplicating the whole application
- maintainability: centralized configuration in a standard location is far easier to handle than a distributed configuration in non-standard locations intermixed with content
- consistency: all other packaged applications follow the same logic, and treating some applications differently just because they have constraints when run under other operating systems or configurations would be inconsistent
There are various strategies to achieve this objective. The simplest strategy is to use symlinks between real file locations and the place where the application expects to find them. Another strategy, cleaner but more difficult to maintain through successive upstream releases, involves patching (unless the patch could be pushed upstream). In any case, using a different setup than the default one should be explicitly documented in the package documentation.
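The symlink strategy can be sketched as follows. This is only an illustration under stated assumptions: the config.php file name is hypothetical, and a scratch directory stands in for the rpm build root.

```shell
#!/bin/sh
# Sketch of the symlink strategy in a scratch build root.
# The file name config.php and the package name foo are hypothetical.
set -e

buildroot=$(mktemp -d)

# Real locations, following the FHS.
install -d -m 755 "$buildroot/etc/foo" "$buildroot/usr/share/foo"
echo '<?php // settings ?>' > "$buildroot/etc/foo/config.php"

# The application expects config.php next to its code, so leave a
# symlink there. The target is an absolute path because it must
# resolve on the installed system, not inside the build root.
ln -s /etc/foo/config.php "$buildroot/usr/share/foo/config.php"

# Show where the link points.
readlink "$buildroot/usr/share/foo/config.php"
```

The application keeps opening the path it was written for, while the real file lives under /etc/foo.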
Here are the various groups of files for a sample package called foo.
Constant files
These are the files that are not meant to change during the application's lifetime (templates, libraries, web pages, CSS stylesheets, etc.). They should go to either /usr/share/foo for arch-independent files, or /usr/lib/foo for arch-dependent files. They are owned by the root user and the root group.
Variable files
These are the files that are likely to change during the application's lifetime. They go in /var/lib/foo. If they are supposed to be editable by the application directly from the web interface, they should be owned by the apache user and the apache group. If they are only read from the interface and modified by other means, such as cron tasks, they can safely be owned by another user and group, typically root:root.
Log files
They go in /var/log/foo and are owned by the apache user and the apache group. Caution: starting from level 3, the default msec rules change the ownership of all files directly under /var/log to the root user and the root group, so even a single log file should have its own directory to avoid breakage.
Config files
These go in /etc/foo, or in /etc directly, and are owned by the root user and the root group. Files containing sensitive information, such as passwords, should be owned by the apache group and should not be world-readable. Some applications have a web configuration interface, which makes it mandatory to change the file owner to apache as well.
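As a rough sketch, the layout described above could be created like this for the sample package foo. A scratch directory stands in for the build root, and the foo.conf file name and its content are placeholders. Ownership appears only in comments, since chown requires root; in a real package it would be declared in the spec file.

```shell
#!/bin/sh
# Sketch: lay out the file groups for the sample package foo inside a
# scratch build root. Ownership is only noted in comments.
set -e

buildroot=$(mktemp -d)

install -d -m 755 "$buildroot/usr/share/foo"  # constant files, root:root
install -d -m 755 "$buildroot/var/lib/foo"    # variable files, apache:apache if web-editable
install -d -m 755 "$buildroot/var/log/foo"    # log files, apache:apache, in their own directory
install -d -m 755 "$buildroot/etc/foo"        # config files, root:root

# A config file holding a password: meant for root:apache, and not
# world-readable. The file name and content are placeholders.
echo 'db_password = changeme' > "$buildroot/etc/foo/foo.conf"
chmod 640 "$buildroot/etc/foo/foo.conf"

ls -ld "$buildroot/etc/foo/foo.conf"
```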
Apache integration
Configuration
Location
Packages should not rely on individual per-directory .htaccess configuration files, but should instead use a single configuration file that is dynamically included by apache at startup. This offers several advantages:
- centralized configuration is easier to maintain
- parsing the configuration once at startup is more efficient than parsing every directory for each request
- msec is able to enforce permissions and ownership easily
This file should be installed as /etc/httpd/conf/webapps.d/foo.conf, owned by the root user and the root group. The %_webappconfdir rpm macro corresponds to the /etc/httpd/conf/webapps.d directory.
Content
The configuration should contain the following elements:
- an alias, matching the application name, to the application web root (/usr/share/foo, usually)
- access permission for the application
- any additional configuration directives required
Defining a default access policy for web applications is a difficult task. It is unrealistic to expect to fit end users' needs out of the box, given the wide range of usage scenarios and applications. After recurrent discussions on the topic, the consensus is to distinguish two cases:
- applications that can modify the state of the host computer (phpmyadmin, phpldapadmin, ...) should be restricted to localhost by default
- other applications should be open to all by default
In both cases, the following rules apply:
- use explicit access rules, rather than relying on default values and behavior
- when access is denied, use an explicit error message stating where the rule is defined in the configuration
For more information, see the relevant topics in the apache documentation.
Examples
Here is a default configuration suitable for a restricted access scenario:
    Alias /foo /usr/share/foo
    <Directory /usr/share/foo>
        Order deny,allow
        Deny from all
        Allow from 127.0.0.1
        ErrorDocument 403 "Access denied per %{_webappconfdir}/%{name}.conf"
    </Directory>
Here is another, suitable for an unrestricted access scenario:
    Alias /foo /usr/share/foo
    <Directory /usr/share/foo>
        Order allow,deny
        Allow from all
    </Directory>
And here is a spec file excerpt showing how to define this configuration:
    # apache configuration
    install -d -m 755 %{buildroot}%{_webappconfdir}
    cat > %{buildroot}%{_webappconfdir}/%{name}.conf <<EOF
    Alias /foo /usr/share/foo
    <Directory /usr/share/foo>
        Order allow,deny
        Allow from all
    </Directory>
    EOF
    ...
    %files
    %config(noreplace) %{_webappconfdir}/%{name}.conf
The genconfig script provided below automatically aggregates all .htaccess files found in the application, when called with rpm build root as an argument: genconfig rpm/tmp/foo.
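To give an idea of the result, here is a rough shell equivalent of that aggregation, run against a scratch build root. It is a sketch and not the genconfig script itself. The admin subdirectory and the content of its .htaccess file are made up for the illustration.

```shell
#!/bin/sh
# Rough shell equivalent of the .htaccess aggregation, for illustration.
set -e

# Scratch build root containing one hypothetical .htaccess file.
buildroot=$(mktemp -d)
install -d "$buildroot/usr/share/foo/admin"
echo 'Deny from all' > "$buildroot/usr/share/foo/admin/.htaccess"

conf=$(
    find "$buildroot" -name .htaccess | while IFS= read -r file; do
        dir=$(dirname "$file")
        # Strip the build root prefix to get the installed path.
        dir=${dir#"$buildroot"}
        printf '<Directory "%s">\n' "$dir"
        # Copy the directives, indented by one tab.
        while IFS= read -r line; do
            printf '\t%s\n' "$line"
        done < "$file"
        printf '</Directory>\n'
    done
)

printf '%s\n' "$conf"
```

Each .htaccess file becomes one Directory block keyed by its installed path, which can then be pasted into the package configuration file.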
Installation/Uninstallation
Reloading the apache configuration is not trivial: reloading is quicker than restarting, but it will not take care of new files. Moreover, you don't want to issue two reload commands when you update a single package. Finally, you don't want error messages if apache is not running at all.
In ROSA, filetriggers take care of this automagically.
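For illustration only, the constraints listed above amount to a decision like the one sketched here. The function name and its yes/no arguments are hypothetical, and this is not a description of how the filetriggers are implemented.

```shell
#!/bin/sh
# Sketch of the decision implied by the constraints above.
# The function name and its yes/no arguments are hypothetical.
set -e

apache_action() {
    running=$1    # "yes" if apache is currently running
    new_file=$2   # "yes" if the package adds a new configuration file

    if [ "$running" != yes ]; then
        echo none       # nothing to do, and no error message
    elif [ "$new_file" = yes ]; then
        echo restart    # a reload would miss the new file
    else
        echo reload     # quicker than a full restart
    fi
}

apache_action yes yes
apache_action yes no
apache_action no yes
```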
Dependencies
Standard CGI applications should just require apache, with an additional version requirement if you hardcoded the configuration file location. CGI applications are supposed to be thread-safe, so there is no need to require the prefork apache core specifically.
Applications requiring an embedded interpreter should just require the correct apache module, which will take care of requiring the other apache packages:
- php applications should require apache-mod_php
- mod_perl applications should require apache-mod_perl
genconfig
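    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find;

    # build root to scan, given as the first argument
    my $buildroot = $ARGV[0];

    find(\&callback, $buildroot);

    # called for every file found; File::Find has already changed
    # into the file's directory, so $_ is a bare file name
    sub callback {
        return unless $_ eq '.htaccess';

        open(my $htaccess, '<', $_) or die "Can't open $_: $!";

        # strip the build root prefix to get the installed path
        my $dir = $File::Find::dir;
        $dir =~ s/^\Q$buildroot\E//;

        print "<Directory \"$dir\">\n";
        while (<$htaccess>) {
            print "\t$_";
        }
        print "</Directory>\n";

        close($htaccess);
    }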
This Policy is based on the Mandriva Web Applications Packaging Policy.