Write some Software

$250-750 USD

In Progress
Posted over 8 years ago

Paid on delivery
Need someone to write a script to pull data out of here: [login to view URL]

This is a public database, but it is a really crappy design - probably because they don't want people data mining it. For example, there are 545,309 labor condition applications filed in FY 2015. But the system will not display more than 1,000 results. This means the script that scrapes the database via this web interface must:

1. Use a key of employer names that I can provide from a separate Excel spreadsheet file that Department of Labor already provides (but is missing the data I want)
2. Enter the employer name in the form with the start and end date for certification, then press search
3. The results page will include a small JPEG that says "HTML" on it and clicking it generates an HTML page - I need the script to click on one of these "HTML" images to generate the HTML page
4. I need the script to scrape the name of the signatory, the phone number of the signatory and the e-mail address of the signatory and put that in a data file (XML, Excel, MDB, doesn't matter) in the same record as the employer's name
5. I need the script to do this probably 150,000 times or so, but it would be nice if I can customize it to do this again in the future with different "certificate date" ranges.

What I want is a data file that has the company name, signatory name, signatory phone number and signatory e-mail address. What I have is a data file that has company names and this crap web interface that makes it almost impossible to get the data out of there unless you sit there and enter one company name at a time and do it manually, which will take 500 years.

The developer will probably have to test the script because I don't know how fast the DOL UI will respond to inquiries, if the script can execute multiple inquiries at the same time, etc. I don't think DOL is sophisticated enough to block an IP address that is making thousands of requests, but I'm not sure.
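To make steps 1, 4 and 5 concrete, here is a minimal sketch of the outer loop only. It assumes the employer names have been exported from the Excel file into a plain list, and that the actual form submission and HTML parsing live in a separate `lookup` function. That function, the field names and the date strings are all hypothetical, since the real page structure is not described here. Records are appended to a CSV file, and employers already in the file are skipped, so an interrupted run of 150,000 lookups can be resumed and the same code can be rerun with a different date range.

```python
import csv
from pathlib import Path
from typing import Callable, Iterable, Optional

# Hypothetical output columns: one record per employer, as the posting asks.
FIELDS = ["employer", "signatory_name", "signatory_phone", "signatory_email"]

# Hypothetical signature. A real lookup would submit the search form and
# parse the generated HTML page; it is injected so this sketch runs offline.
Lookup = Callable[[str, str, str], Optional[dict]]


def already_done(out_path: Path) -> set:
    """Return the employers that already have a row in the output file."""
    if not out_path.exists():
        return set()
    with out_path.open(newline="", encoding="utf-8") as f:
        return {row["employer"] for row in csv.DictReader(f)}


def run(employers: Iterable[str], start: str, end: str,
        lookup: Lookup, out_path: Path) -> int:
    """Look up each employer once and append the result to a CSV file."""
    done = already_done(out_path)
    need_header = not out_path.exists() or out_path.stat().st_size == 0
    written = 0
    with out_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if need_header:
            writer.writeheader()
        for employer in employers:
            if employer in done:
                continue  # finished in an earlier run
            record = lookup(employer, start, end)
            if record is None:
                continue  # no result for this employer and date range
            row = {"employer": employer}
            row.update({k: record.get(k, "") for k in FIELDS[1:]})
            writer.writerow(row)
            f.flush()  # keep progress on disk if the run is interrupted
            done.add(employer)
            written += 1
    return written
```

CSV is used here only because it opens directly in Excel; any of the formats mentioned above would work equally well.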
The script will have to be optimized to make inquiries fast enough to complete the data scrape before I am an old man, but slow enough to not freak out DOL's server.
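One way to balance speed against server load is a fixed minimum gap between requests plus an exponential backoff when a request fails. The sketch below is an illustration under assumptions, not a tested configuration: the interval and backoff values are placeholders that would have to be tuned against the real site, and `fn` stands for whatever function performs a single inquiry.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class Throttle:
    """Enforce a minimum number of seconds between consecutive requests."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # first request goes immediately

    def wait(self) -> None:
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


def call_with_backoff(fn: Callable[[], T], throttle: Throttle,
                      max_attempts: int = 4, base_delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run fn() under the throttle, retrying on OSError with doubling delays."""
    for attempt in range(max_attempts):
        throttle.wait()
        try:
            return fn()
        except OSError:
            # OSError covers socket-level failures and urllib's URLError.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_attempts must be at least 1")
```

As a rough guide, 150,000 inquiries at one request every 1.5 seconds take about 62.5 hours in a single thread, so the interval chosen directly sets how long the full scrape runs.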
Project ID: 9037260

About the project

4 proposals
Remote project
Active 8 yrs ago

4 freelancers are bidding on average $383 USD for this job
A proposal has not yet been provided
$412 USD in 10 days
4.9 (92 reviews)
8.1
Hi, I have done many scraping projects in C# and Python. I have also worked with APIs for scraping. I have read the description and would like to discuss further.
$250 USD in 3 days
4.9 (108 reviews)
6.3
A proposal has not yet been provided
$316 USD in 10 days
4.9 (12 reviews)
4.5
A proposal has not yet been provided
$555 USD in 5 days
5.0 (1 review)
2.2

About the client

United States
0.0
0
Payment method verified
Member since Dec 3, 2015

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759) & Freelancer Online India Private Limited (CIN U93000HR2011FTC043854)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)