
README

This README describes GeSHiWrapper.php: a MediaWiki extension that provides a
wrapper around GeSHi, the Generic Syntax Highlighter
(http://qbnz.com/highlighter/) so that tags are available in wiki markup that
syntax-highlight the blocks of code/text included between them, or that
syntax-highlight files sourced from the wiki server's local filesystem.  This
extension supports settings to restrict access to parts of the local filesystem
- by default all access is disabled.

This extension also supports the optional use of dual versions of GeSHi, so
that the new features of a 1.1.x alpha release can be used, with fall-back
to an older 1.0.x release for language formats that aren't yet supported in
the 1.1.x alpha releases.  There's a slight restriction with this approach
that can be worked around - see below.

QUICK INSTALL

+ Copy this README and GeSHiWrapper.php into mediawiki/extensions/GeSHiWrapper,
  first creating that directory if required.
+ Download GeSHi from
  http://sourceforge.net/project/showfiles.php?group_id=114997
  and decompress and copy it to 'extensions/geshi' (the doc and contrib
  directories can optionally be omitted)
+ Add to LocalSettings.php:
include('extensions/GeSHiWrapper/GeSHiWrapper.php');

DUAL-VERSION INSTALL

+ Install your primary version (e.g. a 1.1 alpha-series release) of GeSHi
  as described above.
+ Install your fall-back version (e.g. 1.0.8.6) into a sub-directory named
  differently according to your choice (e.g. geshi-1.0.8.6) and add to
  LocalSettings.php:
$wgGeSHiFallbackDir = '<chosen-sub-directory-name (e.g. geshi-1.0.8.6)>';

DUAL-VERSION RESTRICTION AND WORKAROUND

If a page contains more than one code block, and any block after the first
would be highlighted by a different version of GeSHi than the first, then that
block will not be highlighted and will be preceded by an explanatory HTML
comment.  This is because PHP cannot re-define the GeSHi class once it has
been loaded.  A workaround for anyone who finds this problematic is to patch
one of the GeSHi versions: rename its GeSHi class to something else, rename
the constants that conflict between the versions, and set $wgGeSHiClassName or
$wgGeSHiFallbackClassName - as appropriate - in LocalSettings.php to the new
class name.
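
The restriction can be pictured with a minimal sketch.  The function name
sketch_version_usable() and its parameters are hypothetical and are not part
of GeSHiWrapper.php; the sketch only mirrors the rule described above, on the
assumption that versions are identified by the file that was included.

```php
<?php
// Sketch only: sketch_version_usable() is a hypothetical name, not a
// function provided by GeSHiWrapper.php.
//   $primaryClass, $fallbackClass: the configured class names
//   $wanted: include file of the version needed for this block
//   $loaded: include file of the version already loaded ('' if none)
function sketch_version_usable($primaryClass, $fallbackClass, $wanted, $loaded) {
    // Distinct class names can coexist, so either version is usable.
    if ($primaryClass !== $fallbackClass) {
        return true;
    }
    // Same class name: only the version that was loaded first is usable.
    return $loaded === '' || $loaded === $wanted;
}

// Unpatched: both versions define 'GeSHi' and 1.1.x was loaded first.
var_dump(sketch_version_usable('GeSHi', 'GeSHi',
    'geshi-1.0.8.6/geshi.php', 'geshi/class.geshi.php'));   // bool(false)
// Patched fallback class: the restriction no longer applies.
var_dump(sketch_version_usable('GeSHi', 'GeSHi__1_0_8_6',
    'geshi-1.0.8.6/geshi.php', 'geshi/class.geshi.php'));   // bool(true)
```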

An example unrestricted workaround scheme
-----------------------------------------
The following scheme has been briefly tested using geshi-1.0.8.6; it seems to
be fully-functional.

To LocalSettings.php add:
$wgGeSHiFallbackDir = 'geshi-1.0.8.6';
$wgGeSHiFallbackClassName = 'GeSHi__1_0_8_6';

Then, EITHER, run this (in-place) sed command in the geshi-1.0.8.6 directory:
sed -i 's/GESHI_VERSION/GESHI_VERSION__1_0_8_6/g;
 s/function GeSHi/function GeSHi__1_0_8_6/g;
 s/class GeSHi/class GeSHi__1_0_8_6/g;
 s/GESHI_ROOT/GESHI_ROOT__1_0_8_6/g;
 s/new GeSHi/new GeSHi__1_0_8_6/g;
 s/GeSHi::/GeSHi__1_0_8_6::/g' geshi.php

OR (e.g. if you are not running a UNIX variant and do not have access to the
'sed' command), perform a manual search and replace (e.g. in a text editor)
based on the command: i.e. in the file geshi.php in the geshi-1.0.8.6 directory,
for each line of the above sed command, globally replace all occurrences of the
first token enclosed by forward slashes with the second token enclosed by
forward slashes.
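
Since PHP is already available on any MediaWiki host, the manual search and
replace could also be scripted.  The following is a minimal sketch, not a
tested tool: sketch_rename_geshi() is a hypothetical helper that applies the
same token pairs as the sed command, in the same order, replacing every
occurrence.  Like the sed command it should be applied only once to an
unpatched file.

```php
<?php
// Hypothetical helper; not part of GeSHi or GeSHiWrapper.
// Takes the contents of geshi.php as a string and returns the patched text.
function sketch_rename_geshi($source, $suffix) {
    $map = array(
        'GESHI_VERSION'  => 'GESHI_VERSION' . $suffix,
        'function GeSHi' => 'function GeSHi' . $suffix,
        'class GeSHi'    => 'class GeSHi' . $suffix,
        'GESHI_ROOT'     => 'GESHI_ROOT' . $suffix,
        'new GeSHi'      => 'new GeSHi' . $suffix,
        'GeSHi::'        => 'GeSHi' . $suffix . '::',
    );
    // str_replace applies the pairs one after another, left to right.
    return str_replace(array_keys($map), array_values($map), $source);
}

echo sketch_rename_geshi('$g = new GeSHi($src, $lang);', '__1_0_8_6'), "\n";
// prints: $g = new GeSHi__1_0_8_6($src, $lang);

// To patch the file in place, run from the geshi-1.0.8.6 directory
// (keep a backup first):
// file_put_contents('geshi.php',
//     sketch_rename_geshi(file_get_contents('geshi.php'), '__1_0_8_6'));
```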

EXAMPLE USAGE

Insert the following into the wiki-content of an article:
<php>$insert_php_code_block_to_be_highlighted;</php>
OR, equivalently
<source lang="php">$insert_php_code_block_to_be_highlighted;</source>
OR (disabled by default - see security section below):
<php-file>local/path/on/wiki/host/filename</php-file>
OR, equivalently (default-disabled as above)
<source lang="php" file="local/path/on/wiki/host/filename" />

Any one of the languages found in the geshi/geshi directory (1.0.x) or the
geshi/geshi/languages directory (1.1.x) can be substituted for 'php'.  The
exception is the tag <div>, which is disabled because it conflicts with an
existing HTML tag.

ESCAPING TAGS

MediaWiki's parser often interprets tag-like tokens within the block being
highlighted; to deal with this it's possible to specify "opentagescape" as a
parameter to the "source" tag, and to set the value of this parameter to a
string that has been used within the code as a replacement for the problematic
"<".  Prior to highlighting, this string will be replaced with "<" everywhere
that it's found in the block to be highlighted.
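
As an illustration, the substitution amounts to a plain string replacement.
This is a sketch only: the escape string '@LT@' is an arbitrary choice made
for the example, and the $params array stands in for the attributes that
MediaWiki would pass to the tag hook.

```php
<?php
// Sketch of the substitution; '@LT@' is an arbitrary escape string chosen
// for this example.  Wiki markup assumed:
//   <source lang="php" opentagescape="@LT@">echo "@LT@b>bold@LT@/b>";</source>
$params = array('lang' => 'php', 'opentagescape' => '@LT@');
$text   = 'echo "@LT@b>bold@LT@/b>";';

if (isset($params['opentagescape'])) {
    // Every occurrence of the escape string becomes "<" before highlighting.
    $text = str_replace($params['opentagescape'], '<', $text);
}

echo $text, "\n";   // echo "<b>bold</b>";
```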

SECURITY AND FILE VIEWING PERMISSIONS

By default the <*-file> tags are disabled by restriction variables.  This is
because in the absence of restriction, any file to which PHP code run on the
webserver has read access could be viewed using the <*-file> mechanism -
this includes the MediaWiki code itself as well as e.g. (assuming a default
Unix + apache setup) /etc/passwd if it's readable by the apache group.

The restriction variables
-------------------------
Two variables operate simultaneously (the most restricted result after
applying both variables is used) to control which files are accessible -
a regex blacklist and an allowed-sub-trees list. These variables can be set
in LocalSettings.php.  They are:
$wgGeSHiFileBlacklistREs: an array of regexes operating on the original path
  Default value == array('') # block all
  Special value == array()   # block none
$wgGeSHiFileAllowedDirs: an array where each entry is the absolutely-
   specified top-level directory of a tree within which access is allowed.
   It operates against the calculated real path (i.e. after resolving '..'s
   and symbolic links)
  Default value == array()   # allow none
  Special value == array('') # allow all

Clarification/reminder: filesystem permissions that block access will
 override these variables.
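
How the two variables combine can be sketched as follows.  This is an
illustration of the rules as documented above, not the extension's own code:
sketch_file_allowed() is a hypothetical name, the '~' regex delimiter is this
sketch's choice, and the real path is passed in as a parameter so that the
example does not depend on files existing on disk.

```php
<?php
// Hypothetical helper illustrating the documented rules.
//   $path:     the path as written in the tag (blacklist applies to this)
//   $realpath: the resolved path (allowed dirs apply to this)
function sketch_file_allowed($path, $realpath, $blacklistREs, $allowedDirs) {
    // Blacklist stage: array('') blocks everything, array() blocks nothing.
    if ($blacklistREs === array('')) {
        return false;
    }
    foreach ($blacklistREs as $re) {
        // Anything other than a clean non-match (0) blocks the file, so an
        // invalid pattern (preg_match returns false) also blocks it.
        if (preg_match('~' . $re . '~', $path) !== 0) {
            return false;
        }
    }
    // Allowed-dirs stage: array('') allows everything, array() allows nothing.
    if ($allowedDirs === array('')) {
        return true;
    }
    foreach ($allowedDirs as $dir) {
        // Compare against "dir/" so /var/www/htdocs-old does not match.
        $dir = rtrim($dir, '/') . '/';
        if (strncmp($realpath, $dir, strlen($dir)) === 0) {
            return true;
        }
    }
    return false;
}

// Defaults: everything is blocked.
var_dump(sketch_file_allowed('a.php', '/var/www/htdocs/a.php',
    array(''), array()));                                  // bool(false)
// Block none, allow one tree.
var_dump(sketch_file_allowed('a.php', '/var/www/htdocs/a.php',
    array(), array('/var/www/htdocs')));                   // bool(true)
```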

Enabling unrestricted <*-file> tags (not recommended)
-----------------------------------------------------
$wgGeSHiFileBlacklistREs = array();   # block none
$wgGeSHiFileAllowedDirs  = array(''); # allow all

Restricting to a set of directory trees
---------------------------------------
$wgGeSHiFileBlacklistREs = array();   # block none
$wgGeSHiFileAllowedDirs  = array('/var/www/htdocs', '/usr/src/linux');

Restricting only absolutely-specified paths (not recommended)
-------------------------------------------------------------
$wgGeSHiFileBlacklistREs = array('^\/'); # block paths starting with / (the
                             # slash is escaped because patterns are wrapped
                             # in / delimiters).  N.B. this will not block
                             # relative access using ..
$wgGeSHiFileAllowedDirs = array('');    # allow all

Restricting to a set of directory trees except for certain files
----------------------------------------------------------------
$wgGeSHiFileBlacklistREs = array('private\.txt', '\.settings'); # escape periods
$wgGeSHiFileAllowedDirs  = array('/var/www/htdocs', '/usr/src');


GeSHiWrapper.php

<?php
/**
 * GeSHiWrapper.php
 *
 * A MediaWiki extension that supports syntax highlighting in wiki-text.
 *
 * See the README for installation, configuration and usage notes.
 *
 * CREDITS
 *
 * Adapted from GeSHiHighlight.php, written by:
 *   Andrew Nicol, http://www.nanfo.com
 * who writes that his extension was a modification of one by:
 *   Coffman (www.wickle.com)
 * and that Coffman's had been later modified by
 *   E. Rogan Creswick (aka: Largos), largos@ciscavate.org, ciscavate.org/wiki/
 * Additional inspiration drawn from Brion Vibber's SyntaxHighlight.php.
 *
 * Licenced under the General Public Licence 2, without warranty.
 *
 * @licence GPL2 http://www.gnu.org/copyleft/gpl.html
 * @author http://clc-wiki.net/wiki/User:Netocrat
 * @version 1.16
 */

$wgExtensionFunctions[] = 'wfGeSHiSetupTags';

function wfGeSHiSetupTags() {
	global $wgParser, $IP, $wgGeSHiFallbackDir, $wgGeSHiFileBlacklistREs,
	  $wgGeSHiFileAllowedDirs, $wgGeSHiFallbackClassName, $wgGeSHiClassName,
	  $wgExtensionCredits;

	$baselangs = wfGeSHiDetect($IP.'/extensions/geshi', $include);
	if ($wgGeSHiFallbackDir) {
		$fblangs = wfGeSHiDetect("$IP/extensions/".
		  $wgGeSHiFallbackDir, $fbinclude);
	} else {
		$fblangs = array();
		$fbinclude = '';
	}

	# In lieu of DefaultSettings.php entries
	if (!isset($wgGeSHiFileAllowedDirs)) $wgGeSHiFileAllowedDirs = array();
	if (!isset($wgGeSHiFileBlacklistREs)) {
		$wgGeSHiFileBlacklistREs = array('');
	}
	if (!isset($wgGeSHiClassName)) $wgGeSHiClassName = 'GeSHi';
	if (!isset($wgGeSHiFallbackClassName)) {
		$wgGeSHiFallbackClassName = 'GeSHi';
	}

	$tagdata = array($include => array($wgGeSHiClassName, $baselangs),
	               $fbinclude => array($wgGeSHiFallbackClassName,
	                              array_diff((array)$fblangs, $baselangs)));
	$wgParser->setHook('source', 'wfGeSHiSource');
	// Initialise static variable in wfGeSHiSource
	wfGeSHiSource($tagdata);

	# div conflicts with an html tag and causes parser problems
	$tagdata[$include][1] = array_diff($tagdata[$include][1], array('div'));
	$tagdata[$fbinclude][1] = array_diff($tagdata[$fbinclude][1],
	  array('div'));

	wfProfileIn('wfGeSHiSetupTags-set-parser-hooks');
	foreach ($tagdata as $inc => $classlang) {
		list($classname, $langs) = $classlang;
		foreach ($langs as $lang) {
			$wgParser->setHook($lang,
			  function($text) use ($lang, $inc, $classname) {
				return wfGeSHiHighlight($lang, $text, $inc,
				  $classname);
			  });
			$wgParser->setHook($lang.'-file',
			  function($file_name) use ($lang, $inc, $classname) {
				return wfGeSHiHighlightFile($lang, $file_name,
				  $inc, $classname);
			  });
		}
	}
	wfProfileOut('wfGeSHiSetupTags-set-parser-hooks');

	$wgExtensionCredits['other'][] = array(
	'name' => 'GeSHiWrapper',
	'url' => 'http://clc-wiki.net/wiki/Project:Config:Wiki:GeSHiWrapper',
	'description' => 'Adds tags that use [http://qbnz.com/highlighter/ '.
	  'GeSHi] to syntax-highlight enclosed text or a specified local file '.
	  'as one of the following languages: '.wfGeSHiLangList($tagdata));

}

function wfGeSHiLangList($tagdata) {
	$a = array_values($tagdata);
	# assuming a primary and a fallback and no others
	$all = array_merge($a[0][1], $a[1][1]);
	sort($all);
	return implode(', ', $all);
}

function wfGeSHiDetect($basedir, &$include) {
	global $wgMemc, $wgDBname, $wgDBprefix;

	$key = "$wgDBname:$wgDBprefix:geshidetect:$basedir";
	$ret = $wgMemc->get($key);
	$cacheHit = $ret ? true : false;
	if ($cacheHit) {
		wfDebug(__FUNCTION__.": LOADED '$key' from cache\n");
	} else {
		wfDebug(__FUNCTION__.": '$key' NOT FOUND in cache.\n");
		$ret = array();
	}
	$dir = "$basedir/geshi/languages";
	if (is_dir($dir)) {
		# GeSHi 1.1.x
		$include = "$basedir/class.geshi.php";
		if (!$cacheHit) {
			foreach(glob("$dir/*") as $langdir) {
				$d = basename($langdir);
				# ignore CVS checkout-data directories
				if (is_dir($langdir) && $d != 'CVS') $ret[] = $d;
			}
		}
	} else if (!file_exists($basedir)) {
		die("wfGeSHiDetect: \"$basedir\" not found.");
	} else {
		# GeSHi 1.0.x
		$include = "$basedir/geshi.php";
		if (!$cacheHit) {
			$dir = "$basedir/geshi";
			foreach(glob("$dir/*.php") as $langfile) {
				if (is_file($langfile)) {
					$ret[] = basename($langfile, '.php');
				}
			}
		}
	}
	if (!$cacheHit) {
		$wgMemc->set($key, $ret);
		wfDebug(__FUNCTION__.": saved '$key' to cache.\n");
	}

	return $ret;
}

function wfGeSHiSource($text, $params = array()) {
	static $tagdata = null;
	if (is_array($text)) $tagdata = $text;
	else {
		$def = 'c';
		$hfunc = 'wfGeSHiHighlight';
		if (isset($params['file']) && $params['file']) {
			$text = $params['file'];
			$hfunc .= 'File';
		}
		if (isset($params['opentagescape'])) {
			$text=str_replace($params['opentagescape'], '<', $text);
		}
		if (isset($params['lang'])) $lang = $params['lang'];
		else {
			$lang = $def;
			$dbg = "<!--lang attribute unset; assuming '$def'-->\n";
		}
		foreach ($tagdata as $inc => $classlang) {
			list($classnm, $langs) = $classlang;
			if (in_array($lang, $langs)) {
				$include = $inc;
				$classname = $classnm;
				break;
			} else if ($lang != $def && in_array($def, $langs)) {
				$definc = $inc;
				$defclass = $classnm;
			}
		}
		if (!isset($include)) {
			$dbg = "<!--'$lang' not supported; assuming '$def'; ".
			  "supported languages are: ".wfGeSHiLangList($tagdata).
			  "-->\n";
			$lang = $def;
			# $definc is only set if the default language was found
			$include = isset($definc) ? $definc : $inc;
			$classname = isset($definc) ? $defclass : $classnm;
		}
		$ret = $hfunc($lang, $text, $include, $classname);
		if (isset($dbg)) $ret = $dbg.$ret;
		return $ret;
	}
}
function wfGeSHiHighlight($lang, $text, $include, $classname) {
	global $wgGeSHiFallbackClassName, $wgGeSHiClassName;
	static $included = '';

	if ($wgGeSHiClassName == $wgGeSHiFallbackClassName &&
	  class_exists($classname) && $included != $include) {
		return "<!-- Unable to syntax-highlight the following block ".
		  "- a different version of the GeSHi class has already been ".
		  "used whilst highlighting this page -->\n".
		  "<pre>\n$text\n</pre>\n";
	} else {
		include_once($include);
		$included = $include;
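		# Strip one leading newline.  substr() is used rather than the
		# $text{0} offset syntax, which was removed in PHP 8 and which
		# also raised a notice on an empty string.
		if (substr($text, 0, 1) == "\n") {
			$text = substr($text, 1);
		} else if (substr($text, 0, 2) == "\r\n") {
			$text = substr($text, 2);
		}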
		$geshi = new $classname($text, $lang);
		return wfSyntaxDefaults($geshi);
	}
}

function wfGeSHiHighlightFile($lang, $file_name, $include, $classname) {
	global $wgParser, $wgGeSHiFileAllowedDirs, $wgGeSHiFileBlacklistREs;

	/* This doesn't disable file caching when $wgUseFileCache == true and
	 * nor does it disable client caching when $wgCachePages == true.
	 * MediaWiki doesn't provide a simple mechanism to do either of those so
	 * that they persist on subsequent page views after the first parse.
	 * In these situations, a user must manually purge the page by
	 * accessing it with action=purge as a url query parameter.
	 */
	$wgParser->disableCache();

	if ($wgGeSHiFileBlacklistREs === array('')) {
		$allowed = false;
	} else if ($wgGeSHiFileBlacklistREs) {
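		# Escape any unescaped '/' so it cannot end the pattern early
		$res = preg_replace('#(?<!\\\\)/#', '\\\\/',
		  (array)$wgGeSHiFileBlacklistREs);
		$re = '/('.implode(')|(', $res).')/';
		# preg_match returns false on an invalid pattern: fail closed
		$allowed = (preg_match($re, $file_name) === 0);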
	} else	$allowed = true;
	# An empty allowed-dirs array means "allow none", so it must not skip
	# this check; only array('') means "allow all"
	if ($allowed && ($wgGeSHiFileAllowedDirs !== array(''))) {
		$allowed = false;
		$realpath = realpath($file_name);
		if ($realpath) {
			foreach ((array)$wgGeSHiFileAllowedDirs as $dir) {
				$n = strlen($dir);
				if ($n && $dir[$n - 1] != '/') {
					$dir .= '/';
					$n++;
				}
				if (substr($realpath, 0, $n) == $dir) {
					$allowed = true;
					break;
				}
			}
		}
	}
	if ($allowed) $text = file_get_contents($file_name);
	if (!$allowed || $text === false) {
		$text = "wfGeSHiHighlightFile: \"$file_name\" is unreadable ".
		  "or is restricted from '$lang-file' tag includes.\n";
		// this is commented out because it currently causes an error
		// in GeSHi as it exists in CVS
//		$lang = '';
	}
	return wfGeSHiHighlight($lang, $text, $include, $classname);
}

function wfSyntaxDefaults($geshi) {
	$parseCode = method_exists($geshi,'parseCode')?'parseCode':'parse_code';
	wfProfileIn('wfSyntaxDefaults');
	$ret = $geshi->$parseCode();
	wfProfileOut('wfSyntaxDefaults');
	return $ret;
}

?>