Program a PHP crawler that fetches category directories and stores the structure in MySQL.

In Progress Posted Sep 5, 2004 Paid on delivery

We need a small PHP script for a concept demonstration. The script will spider all the category and sub-category names of www.kompass.com, following links up to 4 levels deep, in 3 languages: English, German and French. We need only the categories, not the level where the firms appear. The category depth ranges from 2 to a maximum of 4 levels.
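As a rough illustration of the depth-limited traversal described above, here is a minimal sketch. It assumes a hypothetical callback, `$fetchChildren`, that takes a page URL and returns the category links found there as `['name' => ..., 'url' => ...]` entries, and returns an empty list once the firm level is reached. How that callback parses the actual pages is not specified here and would depend on the site's markup. The function name `crawlCategories` is also illustrative, and the code uses type declarations from newer PHP versions than the ones named in this posting.

```php
<?php
// Hypothetical sketch: depth-limited category traversal.
// $fetchChildren is an assumed callback: given a URL, it returns a list of
// ['name' => string, 'url' => string] entries for the sub-categories linked
// from that page, or an empty array when there are none.
function crawlCategories(string $url, callable $fetchChildren, int $maxDepth = 4, int $depth = 1): array
{
    $nodes = [];
    foreach ($fetchChildren($url) as $child) {
        $node = ['name' => $child['name'], 'depth' => $depth, 'children' => []];
        // Descend only while we are above the maximum category depth.
        if ($depth < $maxDepth) {
            $node['children'] = crawlCategories($child['url'], $fetchChildren, $maxDepth, $depth + 1);
        }
        $nodes[] = $node;
    }
    return $nodes;
}

// Usage with an in-memory stand-in for the site (made-up data, no network):
$site = [
    'root'    => [['name' => 'Agriculture', 'url' => 'agri']],
    'agri'    => [['name' => 'Cereals', 'url' => 'cereals']],
    'cereals' => [],
];
$fetch = function (string $url) use ($site): array {
    return $site[$url] ?? [];
};
$tree = crawlCategories('root', $fetch);
```

Running the same function once per language start page would give the three separate trees. A real crawler would probably also keep a set of visited URLs to guard against link cycles, which this sketch omits.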

This PHP script must store all the retrieved category names in their logical order in a MySQL database (the sub-categories must be referenced through a numbering system to their relevant top-level categories). The result is 3 tables of complete category trees (one per language) in one MySQL database.
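One possible reading of the "numbering system" is a parent reference: each row carries its own number and the number of its parent category. The sketch below assumes that layout. The table names (`categories_en`, `categories_de`, `categories_fr`), the column names and the helper functions are all illustrative, not part of the requirement, and the code assumes a recent PHP version with PDO available.

```php
<?php
// Assumed layout (illustrative names): one table per language, each row
// pointing at its parent through parent_id (NULL for top-level categories).
function createTableSql(string $lang): string
{
    // Whitelist the suffix so it can safely be embedded in the table name.
    if (!in_array($lang, ['en', 'de', 'fr'], true)) {
        throw new InvalidArgumentException("Unsupported language: $lang");
    }
    return "CREATE TABLE categories_$lang (
        id INT UNSIGNED NOT NULL PRIMARY KEY,
        parent_id INT UNSIGNED NULL,
        level TINYINT UNSIGNED NOT NULL,
        name VARCHAR(255) NOT NULL
    )";
}

// Flatten a nested tree (['name' => ..., 'children' => [...]]) into rows.
// Ids are assigned in pre-order, so sorting by id restores the logical order.
function flattenTree(array $nodes): array
{
    $rows = [];
    $nextId = 1;
    $walk = function (array $nodes, ?int $parentId, int $level) use (&$walk, &$rows, &$nextId): void {
        foreach ($nodes as $node) {
            $id = $nextId++;
            $rows[] = ['id' => $id, 'parent_id' => $parentId, 'level' => $level, 'name' => $node['name']];
            $walk($node['children'] ?? [], $id, $level + 1);
        }
    };
    $walk($nodes, null, 1);
    return $rows;
}

// Insert the flattened rows with a prepared statement.
function storeRows(PDO $pdo, string $lang, array $rows): void
{
    if (!in_array($lang, ['en', 'de', 'fr'], true)) {
        throw new InvalidArgumentException("Unsupported language: $lang");
    }
    $stmt = $pdo->prepare("INSERT INTO categories_$lang (id, parent_id, level, name) VALUES (?, ?, ?, ?)");
    foreach ($rows as $row) {
        $stmt->execute([$row['id'], $row['parent_id'], $row['level'], $row['name']]);
    }
}

// Example with made-up data:
$rows = flattenTree([
    ['name' => 'A', 'children' => [['name' => 'A1', 'children' => []]]],
    ['name' => 'B', 'children' => []],
]);
```

With this layout, a sub-category's chain back to its top-level category can be followed through `parent_id`, and `level` records how deep each name sits in the tree.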

Your work includes the programming and successful testing of the PHP script (the PHP version, 3, 4 or 5, is not important) as well as the definition of the MySQL tables and fields that the script requires to store the category names. When the script is completed and running, you send it together with the "create table" commands for the database. We are willing to pay a maximum of $250, as it is only for a demonstration.

Note: If you cannot provide examples of past experience in PHP-driven website crawler/spider programming, or do not fully understand this project, please do not bid. We are not interested in design services or catalogue apps. This is a specific task for someone with experience in exactly this field.

Another note: If you don't know what a spidering/storage mechanism does, see http://shuetech.com/minetheweb for an example. What we need is much smaller and covers just 1 URL, so that the data (category trees) is ready for demoing a new type of search engine that finds categories.

PHP

Project ID: #5361

About the project

4 proposals Remote project Active Sep 5, 2004