Mirror/crawl/duplicate an entire (large) website -- 2

$250-750 USD

In Progress
Posted about 4 years ago

Paid on delivery
I need to mirror an entire website. It will be at least 10 GB, but could be 50 GB or more. The task is to mirror it successfully, knowing that wget does not succeed at the task and neither does HTTrack. It is more complex than providing a URL pattern for only the pages I want, because I don't quite know what I need yet. Some pages contain audio files. I want the same result one would get by manually navigating the site and saving every page as an .html file, together with the folder that contains all the media files embedded in the HTML.

Once you identify a way to mirror it, it might take a few weeks of a server or computer running. If there is a way to run the process from multiple AWS instances (or similar) and get the whole thing more quickly, that would be great. It is a huge site and I want the entire thing, but a single process will take forever. You could then just put the files on a server, and I will download them.

If you message me, I will tell you the website. I can provide some example pages from it that I would want to make sure your mirror captures. Thanks.
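One possible way to split the crawl across several machines, as asked for above, is to assign every URL to a worker by hashing it, so that each instance downloads a disjoint share of the site. The sketch below is only an illustration under that assumption: the names `shard_for` and `partition` are hypothetical, and how the workers exchange newly discovered URLs (a shared queue, a database, and so on) is left open.

```python
import hashlib


def shard_for(url: str, num_workers: int) -> int:
    """Return a stable worker index in [0, num_workers) for a URL."""
    # A cryptographic hash is used only because it gives the same result on
    # every machine; Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def partition(urls, num_workers):
    """Split an iterable of URLs into num_workers disjoint lists."""
    shards = [[] for _ in range(num_workers)]
    for url in urls:
        shards[shard_for(url, num_workers)].append(url)
    return shards


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    sample = [f"https://example.com/page/{i}.html" for i in range(10)]
    for index, shard in enumerate(partition(sample, 3)):
        print(index, len(shard))
```

Because the assignment depends only on the URL, a worker that discovers a link can tell which instance owns it without any coordination. Spreading requests over more machines does not reduce the load on the target server, so a per-worker rate limit would still be needed.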
Project ID: 23828817

About the project

20 proposals
Remote project
Active 4 yrs ago

Awarded to:
Hi. To mirror a site, you need to download the HTML pages, scan them for links, download all the linked media, and then repeat the process for every linked HTML page. This can be done with Requests and Beautiful Soup. Requests is one of the most popular Python libraries and is used to download the files; Beautiful Soup is used to parse the HTML and find the links. If the site is large, I think the best route is to crawl the site scanning for all the links, storing the non-HTML links and following the HTML ones. That gives us a list of media files of various sorts, which we can then download in a separate step. Since each URL also specifies the directory structure, I can use it to build the mirrored structure for you. Thanks, Marc Nealer
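The approach described in this proposal could be sketched roughly as follows. This is not the bidder's code: to stay self-contained it swaps in the standard library's `html.parser` for Beautiful Soup and leaves out the actual downloading. The names `LinkCollector`, `classify_links`, `local_path`, and the `PAGE_SUFFIXES` set are assumptions made for illustration.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urldefrag, urljoin, urlsplit

# Assumption: links with these extensions (or none) are pages to follow;
# anything else is treated as a media file to download later.
PAGE_SUFFIXES = {"", ".html", ".htm", ".php"}


class LinkCollector(HTMLParser):
    """Collects the values of href and src attributes from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def classify_links(page_url, html):
    """Return (pages, media): same-host links split by file extension."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlsplit(page_url).netloc
    pages, media = set(), set()
    for raw in collector.links:
        # Resolve relative links and drop any #fragment.
        url, _fragment = urldefrag(urljoin(page_url, raw))
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or parts.netloc != host:
            continue  # skip mailto:, javascript:, and other hosts
        suffix = PurePosixPath(parts.path).suffix.lower()
        (pages if suffix in PAGE_SUFFIXES else media).add(url)
    return pages, media


def local_path(url):
    """Map a URL to a relative file path mirroring the site's directories.

    Query strings are ignored in this sketch.
    """
    parts = urlsplit(url)
    path = parts.path.lstrip("/")
    if not path or path.endswith("/"):
        path += "index.html"
    return f"{parts.netloc}/{path}"
```

A crawler would call `classify_links` on each fetched page, push the returned pages onto its queue, append the media URLs to the list for the separate download step, and write each response to `local_path(url)`. Pages that build their links with JavaScript would not be covered by this kind of static parsing.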
$710 USD in 7 days
4.9 (14 reviews)
6.1
20 freelancers are bidding on average $533 USD for this job
Hi, I normally don't trust projects that ask to mirror a website instead of parsing it, but let's see. Hitting it with multiple instances won't be a problem if I decide to take this on. Thanks
$750 USD in 5 days
5.0 (264 reviews)
8.7
Dear Sir, I'm delighted to let you know that I have been scraping data from many sites with PHP-cURL, PhantomJS, Node.js, and Selenium. I scrape the data from a website and write it to a MySQL database or to an Excel, CSV, or XML file. I have worked on many similar projects and have extensive experience in data mining. I have written hundreds of web scrapers which scrape millions of pages each day. I'm ready to fulfill your requirement. I can finish this task in a short time, with the best quality. I can assure 100% accuracy. Please give me the opportunity to do the work. With kind regards, Debdulal Roy
$750 USD in 15 days
4.9 (76 reviews)
7.3
Hi, I have read your requirements carefully and I can do this project. I have 7+ years of experience and in-depth technical knowledge of Python, Flask, Node.js, JavaScript, PostgreSQL administration, Django, MySQL, MongoDB, XML-RPC services, data analysis, web scraping, and ERP/CRM portals, and I apply innovative and creative strategies to deliver outstanding functionality and user experience. I have developed a variety of software applications, including a hospital management system, a university management system, an inventory management system, and many others, as well as dynamic websites and crypto trading indexing. Specialities: database systems (MySQL, PostgreSQL); fluent programming in ML, DL, AI, Python, and PHP; proficiency in Odoo software development; knowledge of IT security and assurance, Linux system administration, and crypto and binary options trading with more than 10x return. I have extensive experience planning, designing, and configuring scalable, highly available, and redundant networks on Amazon Web Services cloud platforms. Message me for more details about my relevant experience, or tell me more about your needs. Looking forward to hearing from you. Thanks
$600 USD in 15 days
4.9 (6 reviews)
5.2
I am writing this proposal to offer my services in software and web development. We are highly trained professional developers looking to freelance and earn online. I have a flair for programming and development and am skilled in Java, C#, C/C++, and PHP, along with MySQL on a XAMPP server. I also build applications for cloud computing and high-performance computing: I know Hadoop, MPI, and OpenMP (parallel and distributed frameworks) and have built applications in CUDA C++ and OpenCL. I also have experience with the well-known ASP.NET framework. My greatest expertise is in Java, in which I have built countless semester projects and a final-year project. You may find many developers in this field, but we assure you that you will not find a team like ours. We ensure not only that the code is of high quality but also that it is optimised and that the program performs the right operation in the right environment, i.e. we create programs that are defect-free. You may also find freelancers who cost less, but they do not give their 100%, which shows in the software and leads to an unhappy customer. We set a reasonable price for the job and make sure we deliver the right product. We highly appreciate your time; if you are interested, kindly let me know.
$500 USD in 7 days
4.9 (7 reviews)
3.5
Hi, my name is Sandeep. I am a senior mobile and web developer with 8+ years of experience. I have good skills in Python, React Native, Angular, PHP, and JavaScript.
Single point of contact to deliver end-to-end:
- Custom software and mobile app development
- Web development, CMS-based solutions, JSON, and third-party APIs such as Google Maps
- Business applications and web portals
- Proficient in platforms such as Node.js, ReactJS, React Native, and Python, for small and medium global customers
- PHP, CodeIgniter, HTML, CSS, Angular, jQuery, JavaScript, etc.
Our clients have come to rely on our expertise, reliability, and speed in bringing innovation to reality. Let's start a chat so that we can discuss the project further. Looking forward to hearing from you soon.
Thanks and regards, Sandeep Gupta
$500 USD in 7 days
5.0 (1 review)
3.5
Hello, I'm a full-stack developer with 7+ years of experience and extensive knowledge of Python. Here's a sample of the backend/DevOps tasks I can help you with:
- Design and implement REST APIs in Flask/Postgres, documented with Swagger, with GraphQL support (using Graphene)
- Dockerise applications and deploy them to AWS ECS, setting up load balancing and auto scaling
- Move infrastructure to code using Terraform
- Integrate apps with AWS EC2, RDS, S3, ElastiCache, ...
- Build CI/CD pipelines for testing and automation, recommending best practices like semantic versioning and changelog automation
- Implement monitoring/logging for Datadog and ElastiCache, including custom instrumentation for APM
- Automate existing workflows in Python
Let's connect to discuss the details. Regards, Mishal
$500 USD in 7 days
5.0 (1 review)
2.8
Hi there! I have more than 8 years of experience in this field. Would you please share more details about the project? I am really interested in working with you over the long term. Best regards, Abhishek
$500 USD in 7 days
3.6 (5 reviews)
2.1
I'm the CEO of a software house. We have about 30 employees, each with at least 6 years of experience. We will deliver the best, most accurate work in about 3 days, and will also provide you with a demo of the website.
$300 USD in 3 days
0.0 (0 reviews)
0.0
Hello, can you please provide the example pages you mentioned in the project description? I am an expert in web scraping and server management, so I hope I can find a way to reduce the download time. Thanks in advance
$500 USD in 7 days
0.0 (0 reviews)
0.0

About the client

New York, United States
5.0
3
Payment method verified
Member since Apr 15, 2016
