
Ends in 16 hours
Paid on delivery
WE WILL NOT NEGOTIATE WITH ANYONE FOR MORE THAN $300

I require a robust, reliable, and automated scraping solution to extract a comprehensive inventory of approximately 250,000 automotive parts from ulti.cl. The primary objective is to generate a clean, well-structured dataset of the entire catalog (which may fluctuate in size) for analytical purposes.

The Core Challenge
The site structure does not allow for a simple menu-based category crawl. Many products are not properly categorized or indexed within the navigation menus, and relying on category trees will result in missing a large portion of the inventory. Therefore, the extraction strategy must bypass traditional navigation and rely on search-query simulation and site-wide index traversal. You will need to implement techniques such as:
Search Simulation: Automating the internal search engine by cycling through relevant automotive keywords, part numbers, or character combinations to surface all 250k+ items.
Sitemap/Crawler Traversal: Identifying and parsing sitemaps or dynamically traversing product IDs to ensure 100% coverage, as standard category crawling will miss a significant portion of the catalog.
High-Volume Management: Given the scale, the architecture must support persistent storage (e.g., SQLite or PostgreSQL) to handle the volume efficiently without memory bottlenecks.
Dynamic Anti-Bot Handling: Implementing rotating residential proxies, user-agent randomization, and request throttling to avoid detection and sustain high-speed extraction across such a large dataset without being blocked.

Scope of Data
Text Data: Product name, complete technical specifications/descriptions, current price, and stock status for each of the ~250,000 products.
Images: Primary product images saved locally, ensuring high-quality, watermark-free versions.
Metadata: SKU, part compatibility (if available), and source URL.

Technical Stack & Requirements
Preferred Stack: Python (Playwright or Scrapy with Playwright integration) is preferred for handling dynamic JavaScript content and complex DOM interactions at scale.
Output Format: Clean, structured data exported to CSV or JSON (or direct database dump), with images organized in a mirrored folder structure.
Documentation: A concise README detailing how to update search parameters and selectors when the site layout changes.
Reporting: An automated runtime log summarizing total pages scanned, items successfully saved, and any skipped/failed URLs.

Acceptance Criteria
Completeness: Must capture at least 95% of the total product catalog (~250k items), specifically targeting "hidden" or non-categorized items.
Accuracy: Zero tolerance for watermarked images; the scraper must retrieve the original source assets.
Autonomous Operation: The script must run without manual intervention once configured, handling credential/proxy rotation automatically.
Resilience: Must handle dynamic content loading and potential rate-limiting effectively across large-scale sessions.

I am looking for a developer who understands how to navigate complex e-commerce architectures where search-depth is the primary bottleneck. Please outline your approach for bypassing the category-based limitations, how you plan to simulate the search-index traversal, and how you plan to manage the memory and storage requirements for a dataset of this size.

Follow-up question: To better tailor the technical implementation, do you have a list of core automotive categories or specific part-number prefixes that you would like prioritized during the initial search-traversal phase?
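As a rough illustration of the search-simulation and persistent-storage requirements in the brief, the sketch below enumerates a catalog by search prefix and writes results straight to SQLite. It assumes the search returns at most a fixed number of results per query, so a full page is treated as a sign of truncation and the prefix is extended by one character. The `search` callable, the `result_cap` value, and the table layout are hypothetical stand-ins; nothing here is known about how ulti.cl's search actually behaves.

```python
import sqlite3
from collections import deque
from typing import Callable, Dict, List

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def enumerate_by_prefix(
    search: Callable[[str], List[Dict[str, str]]],
    db: sqlite3.Connection,
    result_cap: int,
    alphabet: str = ALPHABET,
    max_len: int = 4,
) -> int:
    """Run prefix queries breadth-first and store unique products in SQLite.

    `search` is a hypothetical callable that returns dicts with the keys
    sku, name and url. Returns the number of queries issued.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS products ("
        "sku TEXT PRIMARY KEY, name TEXT, url TEXT)"
    )
    queue = deque(alphabet)
    queries = 0
    while queue:
        prefix = queue.popleft()
        results = search(prefix)
        queries += 1
        # INSERT OR IGNORE deduplicates on the SKU primary key, so items
        # surfaced by several queries are stored only once.
        db.executemany(
            "INSERT OR IGNORE INTO products (sku, name, url) "
            "VALUES (:sku, :name, :url)",
            results,
        )
        db.commit()
        # A full page suggests the result list was truncated: refine the
        # query by appending every character of the alphabet.
        if len(results) >= result_cap and len(prefix) < max_len:
            queue.extend(prefix + ch for ch in alphabet)
    return queries
```

Because every batch is committed to disk as it arrives, memory use stays roughly constant regardless of catalog size; the `max_len` bound keeps the number of queries finite if some prefix never drops below the cap.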
Project ID: 40475466
183 proposals
Open for bidding
Remote project
Active 7 hours ago
183 freelancers are bidding on average $380 USD for this job

Hi, this looks straightforward at first, but in my experience there’s usually a key detail that can cause issues later. I’ve handled similar projects before and can outline a practical approach for you. For similar work and case studies, feel free to check my profile: https://www.freelancer.com/u/microlent Let me know if you'd like me to walk you through the plan. ~ Rajesh
$500 USD in 7 days
9.4

Hi, Elias here from Miami. I understand you're looking for a reliable and automated digital product scraper. The goal is to ensure effective data extraction from various websites with different structures. What usually matters most here is the system’s ability to scale and adapt to changes on the target sites. A common issue in scrapers is managing the intricacies of data formats and handling anti-scraping measures. The tricky part is usually maintaining reliability over time, especially if the data sources change. My approach would involve using Scrapy to build a modular architecture that emphasizes data extraction efficiency while providing clear pathways for future enhancements. I focus on implementing error handling and logging to simplify troubleshooting, ensuring that the scraper can adapt as needed without extensive rewrites. I've developed similar scraping tools that interact with APIs and handle large volumes of data, which has given me insight into potential pitfalls and best practices. A few questions to better understand the scope: Q1 – What specific data points are you aiming to extract? Q2 – Are there any websites you anticipate will be more challenging? Q3 – How critical are real-time data updates, or is periodic scraping sufficient? Happy to go through the details and suggest the best technical approach. Looking forward to hearing from you.
$300 USD in 3 days
8.3

⭐⭐⭐⭐⭐ Create a Reliable Scraping Solution for Automotive Parts Inventory ❇️ Hi My Friend, I hope you are doing well. I reviewed your project requirements and see you are looking for a robust scraping solution for automotive parts. Look no further; Zohaib is here to help you! My team has successfully completed over 50 similar projects for data scraping. I will create a reliable system that efficiently extracts the entire catalog, ensuring high-quality data within your budget. ➡️ Why Me? I can easily handle your scraping project as I have 5 years of experience in data extraction, specializing in complex e-commerce structures. My expertise includes Python programming, web scraping, and database management. Additionally, I have a strong grip on managing dynamic content and implementing anti-bot measures. ➡️ Let's have a quick chat to discuss your project in detail. I can show you samples of my previous work and how I can deliver results that meet your needs. Looking forward to chatting with you! ➡️ Skills & Experience: ✅ Python Programming ✅ Web Scraping ✅ Data Extraction ✅ Database Management ✅ Anti-Bot Techniques ✅ CSV/JSON Output ✅ Search Simulation ✅ Performance Optimization ✅ Error Handling ✅ Project Documentation ✅ Runtime Logging ✅ Dynamic Content Handling Waiting for your response! Best Regards, Zohaib
$350 USD in 2 days
8.1

I can build a scalable Python scraper using Playwright + Scrapy with persistent storage (SQLite/PostgreSQL) to handle the large 250k+ inventory efficiently. Instead of relying on categories, I’ll use search-index traversal, sitemap discovery, keyword/part-number permutations, and direct product URL harvesting to uncover hidden items while rotating proxies, throttling requests, and handling anti-bot protections automatically. The solution will export structured CSV/JSON datasets with organized high-resolution images, runtime logs, duplicate prevention, and a clear README for reruns and selector updates. I also understand the budget constraint and can focus on delivering the most reliable coverage possible within the $300 cap.
$300 USD in 7 days
7.2

As a seasoned specialist in customized Python web automation and data mining, I have the technical stack and experience necessary to undertake this significant project. Throughout my 13+ years in the field, I've regularly tackled complex e-commerce structures and efficiently extracted data even when navigation depth is an obstacle. My strategy involves traversing product IDs dynamically or identifying and parsing sitemaps, ensuring full catalog coverage and mitigating against the limitations of category-based scraping. To address the nuanced challenges of this project, I propose employing rotating residential proxies, randomized user-agents, and smart request throttling for correct anti-bot handling. Moreover, not only will I provide your desired structured dataset in CSV, JSON, or database dump format with organized image folders, but I'll submit a detailed README documenting how to update search parameters and selectors if site layouts change. My commitment to resilience and autonomous operation means that the scraper I develop will run consistently without manual intervention once initially configured. So let's collaborate—I’m ready to leverage my skills to deliver comprehensive, accurate results for your automotive parts inventory project.
$250 USD in 1 day
7.3

With over 10 years of experience as a full-stack developer, I offer a comprehensive set of skills tailor-made for your project. Having extensively worked with large-scale databases like MongoDB, I am well-versed in tackling memory management and storage challenges for data of this magnitude. Incorporating Python (Playwright or Scrapy with Playwright integration) is the perfect match for handling the dynamic and complex DOM interactions that your project demands. I am proficient in extracting structured data and storing it through various formats like CSV or JSON, ensuring easy access and analysis as per your requirements. In addition to my technical expertise, my team and I understand the complex nuances involved in navigating e-commerce architectures where search-depth is crucial. Our approach primarily focuses on bypassing limitations through inventive means such as sitemap/crawler traversal and dynamically traversing product IDs to ensure comprehensive coverage. Moreover, we emphasize dynamic anti-bot handling strategies including rotating residential proxies, user-agent randomization, and request throttling to maintain high-speed extraction while minimizing the risk of detection.
$250 USD in 7 days
7.0

Hello, With 4 years of experience in PHP, Automation, MongoDB, and Software Architecture, I am well-equipped to tackle your project. I understand the requirement for a robust scraping solution to extract 250,000 automotive parts from ulti.cl. My approach involves implementing search-query simulation and site-wide index traversal to ensure comprehensive data extraction. I have carefully reviewed your project description and am confident in delivering a clean, structured dataset for analytical purposes. My expertise in PHP, Python, Web Scraping, Software Architecture, MongoDB, and Data Extraction aligns well with the technical stack and requirements outlined in the job description. I am eager to discuss further details and explore how we can achieve the project goals together. Please connect with me for a detailed conversation. Best regards, Taimoor from Pixels Soft Feel free to connect in chat for further discussion.
$500 USD in 7 days
6.6

Hello! I am a US-based senior software engineer with extensive experience in web scraping, data extraction, and automation. I carefully read your project description and I’m excited about the opportunity to build a robust and reliable digital product scraper for you. With over 15 years in the industry, I have honed my skills in PHP and Python, utilizing frameworks like Scrapy for efficient data collection. My goal is to ensure that your scraper not only meets your requirements but also operates seamlessly and efficiently. To clarify and better understand your needs, could you please clarify the following questions? 1. What specific websites or platforms do you intend to scrape data from? 2. Do you have any preferred data formats for the output (e.g., JSON, CSV)? 3. Are there any particular challenges or issues you've faced in previous scraping attempts that I should be aware of? I’m committed to delivering a solution that exceeds your expectations. After understanding your requirements, I would suggest planning the project in phases: initial requirements gathering, development, testing, and deployment. I look forward to the possibility of working together and creating a solution tailored to your needs. Best, James Zappi
$300 USD in 3 days
6.4

Warm greetings! I specialize in large-scale web scraping and data extraction for complex e-commerce systems. With 9+ years in Python, Scrapy, Playwright, MongoDB, and automation, I build resilient crawlers for high-volume product catalogs.
Here's how I can help:
* Hybrid sitemap + search + ID-based crawling for full coverage
* Playwright/Scrapy for JS rendering and deep extraction
* MongoDB/PostgreSQL storage with deduplication at scale
* Throttling, session control, and proxy management for stability
* CSV/JSON export + automated image download structure
Before starting, do you have sample URLs, sitemap links, or login access? Also, which DB do you prefer: MongoDB or PostgreSQL?
$500 USD in 7 days
6.2

https://www.freelancer.com/projects/data-scraping/Automated-Counterfeit-Detection/reviews Dear client, nice to meet you. I am very pleased to submit my proposal for your scraping and automation project. I have extensive experience in this field using Python. Recently, I developed an Automated Counterfeit Detection and Reporting System for Amazon, which you can check in my portfolio. I am confident in this work and can start immediately. I look forward to hearing from you. Thank you.
$250 USD in 5 days
5.8

To capture the full catalog despite missing categories, I would build a search crawler that cycles through a broad set of automotive keywords and part-number patterns. This will simulate user queries across all possible product segments, ensuring “hidden” items surface. I’d include heuristics to expand the keyword set dynamically if new patterns appear during crawling. For storage, I’d use PostgreSQL to handle the 250k+ products efficiently, storing text data and metadata with relations to image files saved locally in a mirrored folder structure. This avoids memory issues and supports incremental updates. To keep the scraper running smoothly, I’d integrate rotating residential proxies with randomized user agents and built-in throttling to prevent blocking from the site’s anti-bot measures. Playwright would handle dynamic JS loading and help extract watermark-free images by grabbing the original source URLs rather than screenshots. The scraper will produce a CSV/JSON export alongside an automated runtime log covering pages scanned, items saved, and failures. I will provide a simple README explaining how to update selectors and search parameters if the site structure changes. Two points to confirm: 1) Do you have a core list of automotive categories or part prefixes to prioritize early in search simulations? 2) Should the scraper handle manual login or credential rotation if the site requires authentication? I can start immediately with a focused plan to build a reliable, high-coverage extraction tool that runs autonomously once set up.
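The dynamic keyword expansion mentioned in this proposal could look roughly like the sketch below: each product returned by a query is mined for new tokens, and unseen tokens are queued as further search terms. The `search` callable, the token pattern, and the product dict keys (`sku`, `name`) are assumptions made for illustration, not details taken from the proposal or from the target site.

```python
import re
from collections import deque
from typing import Callable, Dict, Iterable, List

# Assumed heuristic: any lowercase alphanumeric run of 3+ characters in a
# product name is worth trying as a new search term.
TOKEN_RE = re.compile(r"[a-z0-9]{3,}")


def crawl_with_expansion(
    seeds: Iterable[str],
    search: Callable[[str], List[Dict[str, str]]],
    max_queries: int = 10_000,
) -> Dict[str, Dict[str, str]]:
    """Query seed terms, then keep querying tokens found in the results.

    Returns the discovered products keyed by SKU.
    """
    queue = deque(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen_terms = set(queue)
    found: Dict[str, Dict[str, str]] = {}
    queries = 0
    while queue and queries < max_queries:
        term = queue.popleft()
        queries += 1
        for product in search(term):
            found[product["sku"]] = product
            for token in TOKEN_RE.findall(product["name"].lower()):
                if token not in seen_terms:
                    seen_terms.add(token)
                    queue.append(token)
    return found
```

Expansion of this kind only reaches products connected to the seeds through shared vocabulary, so it would normally be combined with a broad seed list or another discovery method; `max_queries` caps the total request budget.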
$750 USD in 7 days
5.9

Hello, I'm Karthik, a Full-Stack Developer with 15+ years of experience in large-scale web scraping, data extraction, automation, and Python development. I understand the challenge here is not category crawling but achieving maximum catalog coverage through search-index traversal and product discovery.
My approach would be:
✔ Analyze search endpoints, sitemap structures, APIs, and product URL patterns
✔ Implement Playwright + Python for dynamic content handling
✔ Use search-term permutations, part-number traversal, and sitemap discovery to uncover hidden products
✔ Store results incrementally in PostgreSQL/SQLite to avoid memory bottlenecks with 250k+ records
✔ Download original product images and maintain structured local storage
✔ Generate CSV/JSON exports, runtime logs, and documentation
I have experience building resilient scrapers with proxy support, retry mechanisms, rate-limit handling, and automated recovery for long-running extraction jobs. One important note: achieving verified 95%+ coverage depends on the site's actual search/index architecture and accessibility. I would first perform a discovery phase to identify the most efficient traversal strategy before full extraction. I can work within your stated budget and provide a maintainable, well-documented solution.
Best regards, Karthik
15+ Years Experience | Python | Playwright | Scrapy | Data Extraction
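For the structured local image storage mentioned above (the brief calls it a mirrored folder structure), one possible mapping from an image URL to a local path is sketched below. The function name, the `images` root, and the decision to drop the query string are assumptions for illustration only.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def mirrored_path(image_url: str, root: str = "images") -> PurePosixPath:
    """Map an image URL to root/<host>/<url path>, ignoring the query string."""
    parts = urlsplit(image_url)
    # Drop empty, "." and ".." segments so a crafted URL cannot escape root.
    segments = [
        seg
        for seg in unquote(parts.path).split("/")
        if seg not in ("", ".", "..")
    ]
    if not parts.netloc or not segments:
        raise ValueError(f"cannot derive a file path from {image_url!r}")
    return PurePosixPath(root, parts.netloc, *segments)
```

Keeping the host and path in the local layout makes it straightforward to trace a saved file back to its source URL when auditing for watermarked or missing images.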
$750 USD in 7 days
5.8

Hi, I understand the challenge here is coverage, not just scraping. For a catalog this large and partially hidden from category navigation, I would avoid relying on menu traversal entirely.
My approach would combine:
- Search-query simulation using keyword batches, part-number patterns, and indexed search permutations
- Sitemap and endpoint discovery to uncover hidden/non-linked products
- Persistent storage with SQLite/PostgreSQL for checkpointing and deduplication
- Playwright + async scraping for JavaScript-rendered pages
- Rotating proxies, randomized headers, adaptive throttling, and retry queues for resilience
The scraper would collect:
- Product name
- SKU / compatibility data
- Descriptions/specs
- Pricing and stock
- Source URLs
- Original high-resolution images
Deliverables would include:
- Structured CSV/JSON export
- Local image archive
- Runtime logs and failed-URL reporting
- Modular Python code with README and rerun instructions
I’m comfortable working within the stated budget and focusing on reliability, coverage, and autonomous execution rather than a quick one-off crawl.
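The throttling, retry, and failed-URL reporting items in this proposal can be illustrated with a small exponential-backoff wrapper. This is only a sketch: `RateLimited`, the `fetch` callable, and the delay values are hypothetical, and a real run would translate the HTTP client's rate-limit responses into this exception.

```python
import random
import time
from typing import Callable, Dict, Iterable, List, Tuple


class RateLimited(Exception):
    """Hypothetical signal that the site throttled or blocked a request."""


def fetch_with_backoff(
    fetch: Callable[[str], str],
    url: str,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url), waiting base_delay * 2**attempt plus jitter between tries."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            # Random jitter avoids retrying on a perfectly regular schedule.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")


def crawl(
    urls: Iterable[str], fetch: Callable[[str], str], **backoff_options
) -> Tuple[Dict[str, str], List[str]]:
    """Fetch every URL; return (saved pages, URLs that failed after all retries)."""
    saved: Dict[str, str] = {}
    failed: List[str] = []
    for url in urls:
        try:
            saved[url] = fetch_with_backoff(fetch, url, **backoff_options)
        except RateLimited:
            failed.append(url)
    return saved, failed
```

The `failed` list is what would feed the skipped/failed-URL section of the runtime log, and passing `sleep` as a parameter lets the delay logic be exercised without actually waiting.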
$250 USD in 3 days
5.5

Hi, I have experience handling large-scale data collection from complex websites where standard category navigation is incomplete or unreliable. I can extract and organize high-volume product data into a clean, structured Excel/CSV file, including product details, pricing, stock status, metadata, source URLs, and images if required. I’m comfortable working with dynamic websites, large datasets, and anti-bot limitations while keeping data accurate and deduplicated. Before starting, I would first review the site structure to choose the most effective approach and maximize catalog coverage. I’m available to start immediately.
$300 USD in 7 days
5.2

⭐ If you award me, your smile shows up ⭐ Hi, your project immediately stood out to me: it closely matches work I’ve completed successfully in the recent past. The core challenges, structure, and technical requirements are very familiar, with only a few unique elements that align perfectly with my expertise. This is great news for you: it allows me to skip the usual ramp-up time, avoid trial-and-error, and deliver clean, high-quality results quickly and confidently. I bring hands-on experience with Data Analysis, MongoDB, Data Extraction, PHP, Scrapy, Automation, Database Management, Software Architecture, Web Scraping and Python, along with proven workflows and best practices refined through multiple similar projects. You can view a directly relevant example in my portfolio here: https://www.freelancer.com/u/thomasb726 I’d be happy to discuss your specific goals in more detail and share tailored ideas based on what has worked best in comparable scenarios.
Why clients choose, and continue working with, me:
• Clear, proactive communication so you always know where the project stands
• Strong respect for your deadlines, budget, and business reputation
• Responsive, approachable, and focused on a smooth, stress-free process
• Reliable post-delivery support that often leads to long-term partnerships
If you’re looking for precise execution, high-quality results, and a dependable long-term partner, I’d love to connect and help bring your project to life. Best regards, Tom
$500 USD in 2 days
4.9

Hi there, As a highly experienced freelancer in the fields of automation and data extraction, I am particularly adept at navigating the complexities that e-commerce architectures often present. My proficiency in Python and its libraries, including Scrapy with Playwright integration, lends itself well to overcoming obstacles such as search-depth limitations and dynamic content loading. By employing techniques like sitemap/crawler traversal, search-query simulation, rotating residential proxies, user-agent randomization, and request throttling, I not only ensure high-speed extraction but also reduce the risk of being blocked by anti-bot measures. Furthermore, I understand the scale of your project and how it affects issues like memory bottlenecks. That's why my choice of persistent storage via SQLite or PostgreSQL aligns with your requirement for an architecture that can handle the large volume without sacrificing productivity. My experience handling similar projects at scale also allows me to confidently assure you that I can manage memory and storage smoothly.
$500 USD in 3 days
5.0

Hey there, I'm Vishal Maharaj, a seasoned professional with 25 years of experience in PHP, Python, Software Architecture, Automation, MongoDB, and Web Scraping based in Perth, Australia. I am keen to take on the challenge of developing an automated scraping solution for extracting a comprehensive inventory of automotive parts from ulti.cl. My approach would involve implementing search-query simulation and site-wide index traversal to overcome the site's complex structure and ensure a clean, well-structured dataset for analytical purposes. If you're interested in discussing this project further, feel free to initiate the chat. Cheers, Vishal Maharaj
$500 USD in 5 days
5.1

Hi, I do have some questions, but here's what I can do for you:
- Build a search-driven crawler that systematically enumerates the full catalog, so no hidden or uncategorized parts slip through
- Set up persistent storage with PostgreSQL from the start, keeping memory usage flat even at 250k+ records so the run stays stable
- Implement rotating proxies, randomized fingerprints, and smart throttling so the scraper keeps running uninterrupted across long sessions
- Deliver clean structured output with original images, full metadata, and a runtime dashboard logging every scan, save, and skip in real time
- Provide a clear README so you can tweak selectors and search parameters yourself when the site changes
I'll handle the backend with Node.js using modular or microservice architecture and clean code standards for rock-solid reliability and easy updates.
Note: full source code will be delivered.
Note: you won't pay until the work is done.
Send me a message, let's discuss.
$299 USD in 3 days
4.9

Hi, I can handle this project and build a scalable scraping pipeline for extracting the full inventory from ulti.cl. My approach would first focus on analyzing the backend traffic and internal search behavior to determine whether the data can be accessed directly through hidden APIs, search endpoints, or indexed product requests instead of relying on incomplete category navigation. I’ve done this many times on large-scale scraping projects involving millions of records. For the scraper itself, I can use Python with Requests and BeautifulSoup for high-speed extraction, and integrate Selenium only where JavaScript rendering is necessary (Selenium is much slower than Requests, but a real browser is needed when pages must be rendered with JavaScript). The architecture would support persistent storage (SQLite/PostgreSQL), retry handling, logging, proxy rotation, throttling, and resumable execution for stability over long scraping sessions. The scraper will export structured CSV/JSON data, download original product images, and include runtime reports plus a concise README for future maintenance. Looking forward to discussing the details.
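The resumable execution and runtime reporting described here could be sketched with a small SQLite-backed URL frontier: every URL carries a status, so an interrupted run simply continues with whatever is still pending. Table and function names are hypothetical, and `handle` stands in for whatever fetch-and-parse step the scraper would actually use.

```python
import sqlite3
from typing import Callable, Dict, Iterable, Optional


def open_frontier(path: str = "frontier.db") -> sqlite3.Connection:
    """Open (or create) the frontier database; state persists between runs."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS frontier ("
        "url TEXT PRIMARY KEY, status TEXT NOT NULL DEFAULT 'pending')"
    )
    return db


def enqueue(db: sqlite3.Connection, urls: Iterable[str]) -> None:
    """Add URLs as pending; URLs that are already known are left untouched."""
    db.executemany(
        "INSERT OR IGNORE INTO frontier (url) VALUES (?)",
        [(url,) for url in urls],
    )
    db.commit()


def run(
    db: sqlite3.Connection,
    handle: Callable[[str], None],
    limit: Optional[int] = None,
) -> Dict[str, int]:
    """Process pending URLs and return a status -> count summary for the log."""
    rows = db.execute(
        "SELECT url FROM frontier WHERE status = 'pending' ORDER BY url"
    ).fetchall()
    if limit is not None:
        rows = rows[:limit]
    for (url,) in rows:
        try:
            handle(url)
            status = "done"
        except Exception:
            status = "failed"
        # Commit per URL so a crash loses at most the item in flight.
        db.execute("UPDATE frontier SET status = ? WHERE url = ?", (status, url))
        db.commit()
    return dict(
        db.execute("SELECT status, COUNT(*) FROM frontier GROUP BY status").fetchall()
    )
```

The summary returned by `run` maps directly onto the report the brief asks for (items saved, skipped, and failed), and failed URLs could be reset to pending for a later retry pass.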
$250 USD in 5 days
5.0

Nice to meet you. The requirements of your project match my areas of work and skills. To introduce myself: my name is Anthony Muñoz, and I am the lead engineer for DS Pro IT agency. I have worked for over 10 years as a full-stack and software development engineer and have successfully completed multiple jobs. It will be a pleasure to work together on your project. Feel free to discuss the project with me. Greetings.
$774 USD in 7 days
4.9

Miami, United States
Payment method verified
Member since Apr 29, 2021