Improve webpage scraping solution -- 3

$30-250 USD

Closed
Posted over 3 years ago

Paid on delivery
Request details

I developed a Java program to scrape information from a website. The architecture of the solution involves:

1) using Java Selenium to send requests to the webpage via Chrome WebDriver to trigger authentication and authenticated requests;
2) routing the requests from Chrome (headless) through Java BrowserMobProxy to capture three HTTP headers (Authorization, X-CSRF-TOKEN, and Cookie) and one query string (without these, the server starts responding 512 after some requests); and
3) using these 4 elements in HTTPS requests sent from Java directly to the webpage (i.e. without Selenium, Chrome, and BrowserMobProxy involved) to retrieve the desired information.

This program performs the basic function of extracting the information but has a few problems:

- It depends on an external non-Java component: Chrome WebDriver.
- It depends on Java Selenium and Java BrowserMobProxy, two dependencies that I would like to remove.
- It is not optimized (too many refreshes and overly long sleep periods) relative to the limit at which the webpage (behind Cloudflare) starts responding with 429 errors. Thus, the retrieval of the information takes much more time than needed.

Deliverables

You will get the current program's Java code and you will need to solve the problems above. To do so, you will need to:

A. Find out how to authenticate and refresh the 3 headers and the query string without depending on Selenium, Chrome WebDriver, and BrowserMobProxy. As most of this data is likely generated in JavaScript, you will need knowledge of JavaScript and of how to execute JavaScript from within Java or convert the JavaScript code to Java (the preferable solution).
B. Identify the limit at which the webpage (behind Cloudflare) starts responding with 429 errors, and tune the refresh frequency of the headers and the sleep periods to that limit.

You will need to demonstrate the benefits of your changes by extracting the information currently extracted by the program and measuring how long it takes.

Note: you will need to create your own login/password on the webpage. There are no additional requirements to register.
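As a rough illustration of step 3 of the architecture and of deliverable B, the following Java sketch shows one possible way to attach the four captured elements to a direct request and to adapt the sleep period to 429 responses. It is not the client's code: the class name, method names, URL, and all numeric values are hypothetical, and the pacing rule (double the delay on a 429, shrink it by a fixed step otherwise) is just one common heuristic, assumed here for illustration. It uses only `java.net.http` from Java 11+.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch only: every name and value here is hypothetical. */
final class DirectScrapeSketch {

    /** Builds a GET request carrying the three headers and the raw query string. */
    static HttpRequest buildRequest(String baseUrl, String rawQuery,
                                    String authorization, String csrfToken, String cookie) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "?" + rawQuery))
                .header("Authorization", authorization)
                .header("X-CSRF-TOKEN", csrfToken)
                .header("Cookie", cookie)
                .GET()
                .build();
    }

    /** Keeps a delay between requests: doubles it on 429, shrinks it by a step otherwise. */
    static final class Pacer {
        private final long minDelayMs;
        private final long maxDelayMs;
        private final long stepMs;
        private long delayMs;

        Pacer(long initialDelayMs, long minDelayMs, long maxDelayMs, long stepMs) {
            this.delayMs = initialDelayMs;
            this.minDelayMs = minDelayMs;
            this.maxDelayMs = maxDelayMs;
            this.stepMs = stepMs;
        }

        /**
         * Records a response status and returns the delay (ms) to wait before the next request.
         * retryAfterSeconds is the parsed Retry-After header value, or null if absent.
         */
        long onResponse(int status, Long retryAfterSeconds) {
            if (status == 429) {
                long backoff = delayMs * 2;
                if (retryAfterSeconds != null) {
                    backoff = Math.max(backoff, retryAfterSeconds * 1000);
                }
                delayMs = Math.min(maxDelayMs, backoff);
            } else {
                delayMs = Math.max(minDelayMs, delayMs - stepMs);
            }
            return delayMs;
        }
    }

    public static void main(String[] args) {
        // Simulated statuses only; no network access is performed here.
        Pacer pacer = new Pacer(1000, 100, 8000, 50);
        int[] statuses = {200, 200, 429, 200};
        for (int status : statuses) {
            System.out.println(status + " -> next delay " + pacer.onResponse(status, null) + " ms");
        }
    }
}
```

In a real run, the request would be sent with `HttpClient.send`, the returned status fed into `onResponse`, and the thread put to sleep for the returned delay. Wrapping the whole extraction in two `System.nanoTime()` calls would give the elapsed time needed to compare the tuned version against the current program.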
Project ID: 26818015

About the project

8 proposals
Remote project
Active 4 yrs ago

8 freelancers are bidding on average $141 USD for this job
Hello, I am pleased with your job as detailed. Thank you for the job posting. It’s a pleasure to meet you. I’d really like to work with you on this one if possible! I do have a couple of questions, but first I’d like to make you an offer and give you some background so you can check my work out. I have been developing this kind of project for 4+ years, so I have the experience to handle this project. You’ll get all the expected stuff like a great professional service and a fast turnaround, at a bit less, and I get a bit more exposure. If the above offer sounds like something you would be interested in, I’d love to hear from you. Best regards, Adebayo
$30 USD in 1 day
4.9 (48 reviews)
6.3
Dear Employer, I have read the project details and am confident I can work on improving your web scraping solution. I have extensive knowledge of Java, JavaScript, Python, web scraping, software architecture, etc. Kindly message me so that we can discuss the work further. Regards, Lucky
$222 USD in 3 days
5.0 (36 reviews)
5.3
Hello Sir! I am a web scraping expert, and I think I'm a great fit for this project because I have an interest in your project and can deliver on time, according to your specifications. Thanks
$140 USD in 7 days
4.9 (8 reviews)
4.4
Hello sir ❤ I have read your post and am very interested in working on it. I hope you will trust me to do a great job. I hope you will hand this work over to me. I hope you will be very happy to see my work. I would be very happy if you gave me the job. I wish you a very happy new life. I wish you success in your endeavors ❤
$222 USD in 4 days
0.0 (0 reviews)
0.0
Good day. I'm interested in your project. I have about 3 years of scraping experience and 6 years of Python programming experience. A big plus of using Python is that everything will be automated and that I can write a program quickly. I use all the modern scraping libraries, such as BeautifulSoup, Selenium, Requests, Scrapy and so on. I can deliver the result in whatever form is convenient for you: JSON, CSV, TXT, or an SQL database. I have also worked on large projects and scraped large sites such as Amazon, Alibaba, YouTube and so on, so I know how to work with large amounts of data. If I suit you as a specialist, we can discuss the project in more detail.
$140 USD in 7 days
0.0 (0 reviews)
0.0

About the client

Băilești, Romania
5.0
1
Member since Mar 8, 2020

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759) & Freelancer Online India Private Limited (CIN U93000HR2011FTC043854)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)