The new version of the term extraction tool on fivefilters.org is now in PHP.
Read the blog post explaining what’s new.
For anyone looking for a simple way to carry out term extraction on English text using PHP, here’s a snippet using the PHP port of Topia’s Term Extractor:
require 'TermExtractor/TermExtractor.php';

$text = 'Politics is the shadow cast on society by big business';

$extractor = new TermExtractor();
$terms = $extractor->extract($text);

// We're outputting results in plain text...
header('Content-Type: text/plain; charset=UTF-8');

// Loop through extracted terms and print each term on a new line
foreach ($terms as $term_info) {
    // index 0: term
    // index 1: number of occurrences in text
    // index 2: word count
    list($term, $occurrence, $word_count) = $term_info;
    echo "$term\n";
}
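Once you have the extracted terms, you may want to rank or filter them before printing. The following is a minimal sketch of one way to do that, assuming each entry has the shape described in the snippet above (term, number of occurrences, word count). The `rank_terms()` helper and the sample data are hypothetical and not part of the TermExtractor port; the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the TermExtractor port).
// Each entry is assumed to be [term, occurrences, word count].
function rank_terms(array $terms, int $min_occurrence = 1): array
{
    // Keep only terms that occur at least $min_occurrence times
    $kept = array_values(array_filter($terms, function ($t) use ($min_occurrence) {
        return $t[1] >= $min_occurrence;
    }));

    // Most frequent first; ties are broken by higher word count first
    usort($kept, function ($a, $b) {
        return [$b[1], $b[2]] <=> [$a[1], $a[2]];
    });

    return $kept;
}

// Made-up sample data standing in for the result of $extractor->extract($text)
$terms = [
    ['big business', 1, 2],
    ['politics', 3, 1],
    ['shadow', 1, 1],
    ['society', 2, 1],
];

$ranked = rank_terms($terms);

// Print each term with its occurrence count, one per line
foreach ($ranked as $term_info) {
    list($term, $occurrence, $word_count) = $term_info;
    echo "$term ($occurrence)\n";
}
```

Calling `rank_terms($terms, 2)` instead would drop every term that appears only once, which can help on longer texts where single mentions are mostly noise.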