Semantic Web Crawler

Cancelled Posted Aug 5, 2009 Paid on delivery

I am looking for someone to functionally design and build a website crawler in C# (.NET 3.5). The crawler should crawl a website, gather and store certain pieces of information (metadata and possibly "prominent" data visible on a page), and help categorize the website in one or more ways. Categorization may include full-text indexing and keyword analysis (Bayesian filtering?), and possibly analysis of the search engine results returned for a lookup of the target website. The right person or team will have very good coding skills (high-performance code) as well as knowledge of web crawlers, web information standards, and web information-gathering methods and resources.

A significant bonus would be to scan the website for viruses, malware, etc., possibly by leveraging a third-party tool (e.g. NOD32 antivirus, or another?).

Both the crawling depth and the number of links crawled should be configurable.
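To make the two limits concrete, here is a minimal sketch of a breadth-first crawl bounded by depth and by total pages visited. It is written in Python purely as a language-neutral illustration; the requested implementation is C#. The names `crawl`, `fetch_links`, `max_depth`, and `max_pages` are hypothetical, and the in-memory `SITE` dictionary stands in for real HTTP fetching and HTML link extraction, which are not shown.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl(start_url: str,
          fetch_links: Callable[[str], Iterable[str]],
          max_depth: int = 2,
          max_pages: int = 100) -> List[str]:
    """Breadth-first crawl bounded by link depth and total pages visited."""
    visited: List[str] = []
    seen = {start_url}                 # URLs already queued, to avoid revisits
    queue = deque([(start_url, 0)])    # (url, depth from the start page)
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue                   # do not follow links past the depth limit
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# Toy in-memory "site" (an assumption for illustration only).
SITE = {
    "/": ["/books", "/games"],
    "/books": ["/books/1", "/"],
    "/games": ["/games/1"],
    "/books/1": [],
    "/games/1": ["/secret"],
    "/secret": [],
}


def links_of(url: str) -> List[str]:
    return SITE.get(url, [])


print(crawl("/", links_of, max_depth=1))
print(crawl("/", links_of, max_depth=2, max_pages=4))
```

Breadth-first order means that when the page limit is hit first, the pages closest to the start URL are the ones that have been visited, which is usually what a site-level summary needs.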

This is an important application for a startup business and the project may lead to a long-term business relationship for the right person or team.

If you are qualified and excited to help, please send me your ideas and proposed solutions, along with an estimated price and timeline for the first version of the application.

Thanks and best regards,

John

## Deliverables

The first version of this application is to be discussed, designed, and built over the next few weeks. Details are to be determined and will be influenced by the coder's expertise and recommendations.

The basic goal is to take a website/URL and crawl it to discover what the site is about (e.g. books, games, social networks, etc.), provide an accurate high-level description, and accurately categorize the site into a flexible and growing set of categories based upon analysis of the crawled data.
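As one possible reading of the Bayesian keyword analysis mentioned earlier, the sketch below categorizes page text with a multinomial naive Bayes model using add-one smoothing. It is an illustration in Python, not the requested C# implementation, and it is not a confirmed design for this project. The class `NaiveBayesCategorizer`, its methods, and the two tiny training texts are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training docs seen
        self.vocabulary = set()

    def train(self, category, text):
        # A previously unseen category name is simply added to the model.
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return the log-probability score of the text for each category."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for category, counts in self.word_counts.items():
            total_words = sum(counts.values())
            log_prob = math.log(self.doc_counts[category] / total_docs)
            for word in words:
                log_prob += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[category] = log_prob
        return result

    def categorize(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


model = NaiveBayesCategorizer()
model.train("books", "novel author paperback bestseller chapters")
model.train("games", "play level multiplayer console score")
print(model.categorize("new multiplayer console releases"))
print(model.categorize("bestseller novel by a famous author"))
```

Because categories are just dictionary keys, training on text for a new label extends the category set without other changes, which fits the "flexible and growing" requirement. A real classifier would need far more training text per category than this toy example uses.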

A major bonus feature, and a likely goal for the first version, would be the ability to scan the website for viruses, malware, etc. and determine whether a site is "safe" for browsing. It would also be very desirable to create a "snapshot" image of the website page to be displayed along with the description.

Probably in a future version (because of the timeline for the first implementation), it would be ideal to determine, if possible, for whom the site is appropriate (e.g. adults only, children, and perhaps even particular ages of children?).

I am looking for someone who is passionate about this stuff to develop a first version and then, ideally, help research and develop enhancements and other applications in the future.

Thanks for your interest!

Best Regards,

John

C# Programming, Engineering, Microsoft, MySQL, PHP, Project Management, Software Architecture, Software Testing, SQL, Windows Desktop

Project ID: #2823607

About the project

10 proposals Remote project Active Dec 2, 2009

10 freelancers are bidding on average $669 for this job

| Freelancer | Proposal | Bid | Reviews | Rating |
| --- | --- | --- | --- | --- |
| nguyenhongv | See private message. | $1020 USD in 14 days | 2 | 4.3 |
| TQtech | See private message. | $267.75 USD in 14 days | 30 | 4.4 |
| OmarGamil | See private message. | $425 USD in 14 days | 8 | 3.5 |
| prosolutionvw | See private message. | $425 USD in 14 days | 8 | 4.6 |
| sshramhrvw | See private message. | $425 USD in 14 days | 0 | 0.0 |
| vw7087026vw | See private message. | $127.5 USD in 14 days | 0 | 0.0 |
| hungrymindvw | See private message. | $850 USD in 14 days | 0 | 0.0 |
| chilwal | See private message. | $850 USD in 14 days | 0 | 0.0 |
| DmitriSlam | See private message. | $1020 USD in 14 days | 0 | 0.0 |
| galdaz | See private message. | $1275 USD in 14 days | 0 | 0.0 |