    3,605 scrapy jobs found

    I’m building a fresh prospect list and need someone who can automatically pull contact information from a set of business websites I’ll supply. The goal is simple: turn each URL into clean, usable leads that include company name, contact person (if available), email, phone, and any publicly listed address. You’re free to employ Python with BeautifulSoup, Scrapy, Selenium or a comparable stack—the only requirement is that the solution runs reliably, respects reasonable scraping etiquette, and can be rerun whenever I add new domains. I’ll provide the starting list of sites plus a short test batch so we can verify everything is being captured correctly. Deliverables • A working script (with clear setup notes) or a lightweight app that runs on ...

    $44 Average bid
    33 bids

    I need a scraper for a specific website. Around 800 products. The data should be exported to a spreadsheet and should include: - Product name and description - Price and availability - Detailed product specifications (very important) - Product image URLs - Product configuration options (i.e. additional drop-down variants) - Product URL The scraper should be able to be re-run monthly. Ideal skills and experience: - Experience with web scraping tools (e.g., BeautifulSoup, Scrapy, Selenium) - Proficient in data handling and Excel - Familiarity with scraping e-commerce websites - Ability to set up automated scraping...

    $16 / hr Average bid
    101 bids

    ...just doesn’t offer an easy export option, so I’m after an automated scrape rather than manual copy-paste. For each doctor I only require three fields: full name, suburb and post code. No extra columns are necessary. Once gathered, please deliver everything in a clean, de-duplicated Excel workbook; one row per doctor with clear column headers. A robust script written in Python with BeautifulSoup, Scrapy, or a similar library is perfect, provided it respects the site’s robots.txt and rate limits so we avoid any blocking. I’m happy to receive the script alongside the spreadsheet so the data can be refreshed later, but the primary deliverable is the finished Excel file. Accuracy matters more than speed: the list should be as comprehensive as the source allows, free o...

    $91 Average bid
    113 bids

    I'm looking for an experienced web scraper to extract detailed data for a business directory. The scraped data will be used for market analysis and business development. Required Data: - Contact Information - Business Ratings - Service Details - Customer Reviews - Images - Links to Website - Social Media Links - Any other details Ideal Skills and Experience: - Proficiency in web scraping tools and technologies (e.g., Beautiful Soup, Scrapy, Selenium) - Experience with data cleaning and processing - Ability to deliver data in a structured format (e.g., CSV, JSON) - Attention to detail and accuracy - Prior experience with similar projects is a plus Please provide examples of previous work and estimated time ...

    $338 Average bid
    147 bids
    Scalable Indeed Job Scraper
    3 days left
    Verified

    ...have identified—job title, company name, official company website, full description, salary, skills, experience, location, apply URL and posted date—while de-duplicating anything already stored. Of those, job title and company name must never be missed because they anchor the rest of my pipeline. I would like the core built in Python and I am comfortable if you reach for Playwright, Puppeteer, Scrapy or Selenium so long as the codebase stays clean and readable. Results should flow straight into either PostgreSQL or MongoDB; future scale is important, so please structure tables/collections with growth in mind. Indeed can be unforgiving, so the scraper has to rotate proxies by default. Feel free to layer on further anti-bot tactics (stealth headless settings, adaptive...

    $170 Average bid
    56 bids

    ...unavailable. The goal is a seamless pipeline that collects, normalises and stores the data so my own code can display it in real time. Here is the workflow I have in mind: • API layer – connect to any available REST or GraphQL endpoints, handle authentication keys or tokens, manage rate limits, and return clean JSON. • Scraper fallback – when no API exists, run a headless scraper (Python + Scrapy/BeautifulSoup or comparable stack) with proxy rotation and basic anti-bot evasion, then map the results to the same data structure used by the API calls. • Data handling – write to my existing MySQL database, flag updates, and trigger my site’s content refresh. • Automation & monitoring – schedule jobs vi...

    $298 Average bid
    141 bids

    ...store the collected information in a clean CSV. I want restaurant name, address and email in three distinct columns, followed by the extra fields—phone, opening date and owner—so six columns in total. Please include meaningful error-handling for captchas or downtime, and log any skipped entries so I can review them later. Deliverables • Fully commented source code for the scraper (Python with Scrapy, Selenium, or another robust framework). • One sample CSV showing at least a few live entries from different states in the required column order. • A brief README explaining setup, required libraries, and how to schedule future runs. I will consider the project complete once the script reliably pulls fresh pre-opening data from every state’s DB...

    $504 Average bid
    Featured
    130 bids
    Daily JSON Data Scraping
    1 day left
    Verified

    I run several ongoing web-data initiatives and need a reliable partner to handle the scraping side. Every day you will extract fresh data from a rotating set of target sites and deliver it as well-structured JSON. The workflow should be fully automated (Python, Scrapy, BeautifulSoup, Selenium, or similar) and resilient to layout tweaks, CAPTCHAs, and IP blocks. I’ll provide the list of URLs and the specific fields required for each scrape; you’ll return parsed JSON plus a quick success log so I can drop the data straight into our pipelines. Acceptance criteria • Script or spider runs unattended on a schedule I can trigger via cron or similar • Returned JSON matches the field names and hierarchy I specify for each site • Error handling with retry lo...

    $12 Average bid
    46 bids

    I need an automation script to scrape product information. Requirements: - The script should efficiently gather product data from specified sources. - It should be reliable and handle potential changes in website structures. - Data should be organized and exported in a specified format (e.g., CSV, JSON). Ideal Skills and Experience: - Proficiency in scripting languages (e.g., Python, JavaScript). - Experience with web scraping tools/libraries (e.g., Beautiful Soup, Scrapy). - Strong understanding of handling data and APIs. - Problem-solving skills to manage and debug potenti...

    $10 / hr Average bid
    26 bids

    ...captures every internal page, pulls out all text and every embedded image, and records the direct URLs discovered along the way. Rather than a flat file hand-off, the collected material should be transferred into another website I specify, maintaining the original page structure so that links, headings, and image references remain intact. Please handle the job with a reliable stack such as Python, Scrapy or BeautifulSoup for the crawl, and use whatever bridge—REST, direct database import, or CMS API—you find most efficient for moving the scraped data into the destination site. Deliverables • A reproducible script or crawler project. • Full text and image assets migrated to the destination site, preserving hierarchy and links. • A mapping file (URL...

    $142 Average bid
    212 bids

    ...running follow-up tasks (such as data cleaning, scheduling, or routing to downstream APIs) without manual intervention. Python will power the scraping layer—think requests, BeautifulSoup, Scrapy, or Selenium where dynamic content demands it. Java sits behind the scenes for orchestration and integration with existing services; Spring Boot is already in place, so wiring new endpoints or scheduled jobs into that stack should feel natural to you. Key deliverables • Scrapers that reliably collect product titles, prices, SKUs, images, and stock status, delivered as modular Python scripts or a Scrapy project. • A Java-based automation module that picks up the scraped JSON/CSV output, persists it (PostgreSQL is preferred), and triggers any post-processing...

    $21 / hr Average bid
    114 bids

    ...painful—modules or clear functions will be appreciated, though it doesn’t have to be a full-blown framework. The directory requires no login, but it does paginate the listings. Your solution must handle that pagination automatically, avoid triggering rate-limits, and finish with a CSV that has at least two columns: CompanyName and Email. I’m happy with Python (requests, BeautifulSoup, Selenium, Scrapy—whatever suits the job best) or another language you strongly recommend, as long as setup instructions are straightforward and cross-platform. Deliverables: • Source code with comments • A sample CSV generated from a short run • A brief README with setup and execution steps If anything about the target site complicates scraping—...

    $115 Average bid
    105 bids

    ...they can be stored or linked from the SQL tables • Insert or update rows in the database without creating duplicates, using an upsert strategy I am comfortable provisioning a staging table and sharing schema details, but I expect you to handle the connection logic, error handling, and logging so I can drop the task into a daily cron job the moment it is delivered. Python with BeautifulSoup, Scrapy, or Selenium is preferred, yet I am open to any language that can demonstrate reliability and future maintainability. Please deliver well-commented source code, a brief README with setup steps, and a sample SQL dump or migration file so I can verify the data structure before we move to production....

    $68 Average bid
    42 bids

    ...or reliable associated sources. Specific sources: Euromillones: (since Feb 13, 2004) La Primitiva: (since Oct 17, 1985 – modern version) El Gordo de la Primitiva: (since Oct 31, 1993) Automatic updates at exactly 00:02 the day after each draw, using ethical scraping (BeautifulSoup/Scrapy) with proper user-agent headers to mimic human behavior. Store data in PostgreSQL (structured) or MongoDB (flexible), including all prize categories to enable ROI calculations and backtesting. 2.2. Number Prediction Generate predictions for Euromillones, La Primitiva and/or El Gordo simultaneously using explicit advanced AI models: Machine Learning ensembles (Random Forests) for frequency/statistical

    $1388 Average bid
    46 bids

    ...plug straight into that environment while letting me: Automate outreach & follow-ups • Build reusable email templates, marketing sequences and SMS triggers that fire automatically from Zoho when a lead is added, moved or closed. • Track every send, open and click so the CRM timeline stays complete and reporting is effortless. Gather high-value prospect data • Deploy web scrapers (Python, Scrapy, Selenium or a stack you recommend) to pull fresh contact details, direct mobile numbers and behaviour indicators from public sites and niche industry sources. • Clean, deduplicate and enrich the data before it reaches Zoho, keeping everything GDPR compliant. Acceptance criteria • Inside Zoho I can press one button and watch the full email/SMS ...

    $17 / hr Average bid
    127 bids

    ...happening again?" Do not send a generic proposal. If you cannot answer the question above specifically, please do not bid. Required skills: • Playwright (Python) • Python async / asyncio • Celery workers • Docker / Docker Compose • Session & retry handling • iframe-heavy portals Nice to have: • FastAPI • MongoDB / Redis • Celery Beat • Linux server ops Do not apply if: • Your only automation tool is Selenium or Scrapy • You've never worked on a production system • Your stack is PHP / Laravel / WordPress ...

    $7 / hr Average bid
    30 bids

    ...for downstream analytics • Implement delta scraping to capture only new content efficiently • Integrate with Apify Actors or custom Node.js/Python scripts • Maintain reliability as platforms evolve or change their anti-bot systems Required Skills: • Strong experience with Python or Node.js • Expertise in browser automation frameworks such as Playwright, Puppeteer, or Selenium • Experience with Apify, Scrapy, or similar scraping frameworks • Knowledge of proxy rotation, session management, and anti-detection techniques • Ability to design scalable, maintainable scraping pipelines • Familiarity with REST APIs, OAuth, and data normalization • Experience with MongoDB, PostgreSQL, or cloud storage solutions Preferred Experience: • Prior work scraping platforms with strong anti-bot protections ...

    $3698 Average bid
    203 bids

    ...stops or captchas blocking the flow. Once gathered, the data should be written to a single CSV file, overwriting the previous file on each run so I always have a fresh snapshot. The entire process has to trigger automatically every 24 hours (cron, systemd timer, Cloud Scheduler—whatever you prefer) and run headless on a Linux VPS I’ll provide. I am fine with Python (requests, BeautifulSoup, Scrapy, Selenium), Node with Puppeteer, or another solid stack as long as setup is straightforward. Deliverables • Source code with clear README covering setup, environment variables, and scheduling steps • One-time deployment assistance on my VPS • Proof of a successful unattended daily run (sample CSV + log) Acceptance criteria: a full CSV containing...

    $147 Average bid
    140 bids

    ...text in CSV or JSON and store images in clearly named folders that map back to the records. • Preserve basic structure, so each text record includes the image file name or path. • Respect robots.txt and rate limits; the scrape must be discreet and repeatable. What I’d like to see in your proposal Please outline your end-to-end approach: preferred language or framework (e.g. Python with Scrapy/BeautifulSoup, Selenium for dynamic pages, or another stack you trust), handling of pagination/login barriers, deduplication strategy, and estimated turnaround time. A brief sample architecture diagram or code snippet showing how you handle image downloads would be a plus. Deliverables 1. Scraper script(s) with clear setup instructions. 2. Final datasets (CSV/JSON) a...

    $132 Average bid
    118 bids

    ...projects. Hands-on experience with Scrapy, Selenium, Playwright, Requests, BeautifulSoup, or similar scraping frameworks. Basic to intermediate understanding of AI/LLM-powered automation workflows using ChatGPT, OpenAI APIs, Claude, Gemini, or LangChain. Experience handling dynamic websites, login sessions, cookies, browser automation, and structured/unstructured data extraction. Familiarity with APIs, JSON/XML handling, databases, automation scripting, Git, Docker, or Linux environments. Good analytical, debugging, and problem-solving skills with the ability to work in fast-paced environments. Responsibilities: Develop and maintain web scraping and browser automation scripts for extracting structured and unstructured web data. Build scraping workflows using Scrapy, Selen...

    $271 Average bid
    19 bids

    ...once the project starts. The focus is strictly on text—no images or media files—so the routine should locate, extract, and save headings, paragraphs, or other copy in a clean, structured format such as CSV or JSON. Because the target sites may vary, the code must be written with easy-to-tweak selectors and robust error handling. Please use widely supported libraries (requests, BeautifulSoup, or Scrapy if you prefer) and include polite scraping practices such as user-agent rotation, rate limiting, and graceful retries on common HTTP errors. Deliverables: • A well-commented .py file ready to run from the command line • A brief README explaining any required packages, configuration steps, and how to point the script at a new site I will test the script ag...

    $134 Average bid
    251 bids

    I need a reliable script that can harvest contact details from a set of company websites I will provide. The information I’m after is any publicly listed email address, phone number and physical address that appears on those sites. Please build the scraper in a way that I can rerun it easily—Python with BeautifulSoup, Scrapy or Selenium is perfect as long as the code is well-commented and the output is clean. I prefer a single CSV (or JSON if you feel it suits the data better) with one row per company and distinct columns for each field you capture. Accuracy and completeness matter more than sheer speed, so let me know how long you’ll need once you see the list of domains. If you have questions about structure or formatting, ask before you dive in.

    $11 Average bid
    17 bids

    ...refrigerators, washing machines, microwaves) on individual sheets. Each sheet should list: – Product Name – Description (paragraph form, no HTML tags) – Average Rating (numeric) – Review Text (one row per review, repeat product name if necessary) Include a brief README sheet that explains any column codes or scraping limitations you encountered. Technical notes Feel free to use Python, Scrapy, BeautifulSoup, Selenium or a similar scraper—whatever you are fastest with—provided the final data is accurate, deduplicated and ready for pivot-table analysis. If you plan on using an API or paid proxy service, let me know in advance so I can approve the approach. Acceptance criteria 1. At least 300 unique home-appliance models covered. ...

    $12 Average bid
    10 bids

    ...missing fields from the spec tables. • Images download without watermarks and match the model in both name and variant. • Script completes without manual intervention and respects polite crawl delays. If this sounds straightforward to you and you can turn it around quickly, let me know how soon you can have an initial dump ready and which language or libraries you prefer to use (BeautifulSoup, Scrapy, Puppeteer, Selenium, etc.)....

    $174 Average bid
    78 bids

    ...into a Google Sheets workbook. The pipeline must • survive Amazon’s throttling, bot checks, and page format changes without manual babysitting, • finish a full run on 10 k titles in a single session without crashing or silently skipping rows, and • give me fields that are already matched and normalised so downstream staff can link them to our catalogue instantly. Architecture is up to you: Scrapy, Playwright, headless Chrome, rotating residential proxies, Selenium, or a custom HTTP solution—whichever mix keeps the request footprint human-like and maximises up-time. What matters is that the codebase is clean, well-documented, and easy for an internal engineer to extend later. Deliverables 1. Fully annotated Python source (PEP 8 compliant) pack...

    $9 / hr Average bid
    10 bids

    ...into a Google Sheets workbook. The pipeline must • survive Amazon’s throttling, bot checks, and page format changes without manual babysitting, • finish a full run on 10 k titles in a single session without crashing or silently skipping rows, and • give me fields that are already matched and normalised so downstream staff can link them to our catalogue instantly. Architecture is up to you: Scrapy, Playwright, headless Chrome, rotating residential proxies, Selenium, or a custom HTTP solution—whichever mix keeps the request footprint human-like and maximises up-time. What matters is that the codebase is clean, well-documented, and easy for an internal engineer to extend later. Deliverables 1. Fully annotated Python source (PEP 8 compliant) pack...

    $9 / hr Average bid
    25 bids

    I need all product information from a single website pulled into a spreadsheet-friendly file. The scrape must capture price, availability status, full product descriptions, and a direct link to each product image. This is a one-off extraction, not an ongoing job, so once the data set is complete and verified we are done. Please choose the tools you are most comfortable with (Python, Scrapy, BeautifulSoup, Selenium, WebHarvy or a comparable stack are all fine) as long as the final deliverable meets these acceptance points: • Every product listed on the target site is represented • Columns include: product name, price, availability, description, image URL, and the source page URL • File is delivered in .xlsx or .csv format, cleanly structured and ready to...

    $11 Average bid
    10 bids

    I need a thorough, page-by-page scrape and front-end teardown of a commercial astrology website. The goal is to capture every visible element—from horoscope generators and pricing tables to sign-up flows, how and where each link and each button takes user/...and modal on the domain is represented in the site map. • Screenshots clearly demonstrate workflows such as account creation, horoscope lookup, and checkout. • No automated requests trigger CAPTCHA or block—rate limiting must be respected. • Final data is clean, deduplicated, and organised by section (home, user account, horoscope, shop, etc.). If you have experience with tools like Python-Scrapy, BeautifulSoup, or Playwright for dynamic sites and can package findings into a concise, insightful ...

    $26 Average bid
    19 bids

    ...into a Google Sheets workbook. The pipeline must • survive Amazon’s throttling, bot checks, and page format changes without manual babysitting, • finish a full run on 10 k titles in a single session without crashing or silently skipping rows, and • give me fields that are already matched and normalised so downstream staff can link them to our catalogue instantly. Architecture is up to you: Scrapy, Playwright, headless Chrome, rotating residential proxies, Selenium, or a custom HTTP solution—whichever mix keeps the request footprint human-like and maximises up-time. What matters is that the codebase is clean, well-documented, and easy for an internal engineer to extend later. Deliverables 1. Fully annotated Python source (PEP 8 compliant) pack...

    $87 Average bid
    75 bids

    I’m compiling a comprehensive list of Australian-based IT companies and service providers. For each company I need two things captured accurately: up-to-date contact details (website, email, phone, physical address where available) and a clear summary of the services they offer. ...expand well beyond those. Acceptance criteria • Minimum record count and geographic coverage agreed up front. • Each row includes company name, contact fields, and service list, with no blank columns unless the info genuinely doesn’t exist online. • Duplicates removed and obvious errors validated manually or by script. Please outline your preferred scraping stack (Python, BeautifulSoup, Scrapy, Selenium, etc.), estimated turnaround, and a sample of past similar work...

    $285 Average bid
    89 bids

    ...looking to purchase property. All I need for each lead is clean, current contact information (name, working email, and phone number where available). No property preferences or past purchase history are required at this stage. Please pull from reputable, publicly accessible sources only and respect site terms of service while harvesting data. I’m happy to discuss your preferred stack—Python, Scrapy, BeautifulSoup, Apify, browser automation, or any combination that ensures speed and data integrity—and to agree on volume and turnaround once you confirm feasibility. Acceptance criteria: • A CSV or Google Sheets file containing unique buyer entries, each with at least one verified email. • Bounce rate under 5 % when emails are tested. • Clear notat...

    $593 Average bid
    49 bids

    I need a reliable web-scraping setup that gathers product details from Australian e-commerce, retailer, and manufacturer websi...recognise duplicates, merge the records, and clearly show which site each data point came from. I require packet size, images, and exact names for the different varieties. Please build the scraper, run an initial full pull, and deliver: • A clean CSV formatted exactly to Shopify’s column requirements, with the fields above populated for every unique product. • The documented code or workflow (Python, Scrapy, BeautifulSoup, Selenium; use whatever you prefer) so I can rerun it when prices change. • A brief read-me explaining how to update the regional site list or add new sources. Accuracy and deduplication are critical; I will ...

    $17 Average bid
    38 bids

    ...directories. At the moment I do not have a predefined list of sites, so I need your guidance on which sources are most likely to yield accurate, up-to-date contacts in the niches we decide on together. The scope is strictly phone numbers—emails or social profiles can remain optional for later phases. Please propose the platforms you would mine, outline the techniques or tools you prefer (Python, Scrapy, BeautifulSoup, Selenium, API access, etc.), and tell me how you will stay within each site’s terms of service. Deliverables • A clean CSV or Excel file containing: phone number, source URL, and any immediately available contextual data (name, company, location). • A brief methods note so I understand how the data was gathered and can replicate or upda...

    $129 Average bid
    126 bids

    ...that need to be crawled and their data captured, cleaned, and delivered in a single, well-structured Excel workbook. The exact fields vary by site, but in most cases I’ll want core page information such as titles, visible text blocks, image URLs, and any clearly marked product or contact snippets that appear. You are free to use the stack you’re most comfortable with—Python + BeautifulSoup, Scrapy, or Selenium are all fine—as long as the final result arrives as an Excel file with tidy column headers and no duplicate rows. Deliverables • An executable script (or notebook) with inline comments so I can rerun the scrape later. • The completed Excel file containing all records pulled from the target sites. Acceptance criteria • Ev...

    $433 Average bid
    177 bids

    ...sub-category path, all image URLs, plus every variant attribute such as colour, size or other options the site shows. The scraper should run on demand and be easy to schedule for weekly updates. Anti-bot countermeasures (rotating proxies, polite timing, and basic CAPTCHA handling) will be essential because some of these sites throttle traffic quickly. I’m comfortable with a Python stack, so tools such as Scrapy, BeautifulSoup, Selenium or Playwright are welcome as long as the final code is clean, well-commented and handed over in a private Git repository. Deliverables • CSV or Excel export containing: product URL, title, description, category, sub-category, price, variant attributes, image links • Compressed folder (or cloud bucket) of downl...

    $23 Average bid
    38 bids

    ...contact details from a specific website and receive everything neatly organised in a single Excel workbook. The site displays email addresses, phone numbers and mailing addresses on various pages; I’d like every one of those fields captured, de-duplicated and placed into clear columns so the file is ready for immediate use. Please choose whichever approach suits you best—Python with BeautifulSoup, Scrapy or Selenium are all fine—as long as the result is accurate and the scraping process respects the site’s loading behaviour. If any anti-bot measures appear, let me know your workaround before proceeding. Deliverables • An Excel file (.xlsx) containing all entries of each link • The runnable script or notebook you used, plus brief setup instruc...

    $130 Average bid
    103 bids

    ...product information from a set of websites and refreshes that data every week, without manual effort on my side. The data points I care about most are SKU, product title, current price, availability, description and image URLs. Some of the pages load content dynamically, so the solution may need a headless browser approach (Selenium, Playwright, or similar) in addition to classic libraries such as Scrapy or BeautifulSoup. The workflow I picture is straightforward: the crawler launches on a weekly schedule, pulls the latest product information, cleans and deduplicates the results, then exports everything to CSV and/or JSON. A quick push to a Google Sheet or an S3 bucket afterward would be ideal so the file is immediately usable by the rest of my team. Deliverables ...

    $151 Average bid
    73 bids

    ...the company’s own website. Deliverable per batch (500–1,000 lines): • Company name • Country • Service category (EPC, maintenance, logistics, etc.) • Website, phone and LinkedIn page where available • Source URL and date added All rows must be deduplicated and formatted consistently before delivery. The overall target is 10,000+ companies, so strong automation skills (Python, Apify, Scrapy, BeautifulSoup, etc.) and a scalable workflow are important. When you respond, include: • Your rate per 1,000 verified records • The tech stack you prefer • A brief example of similar extraction work you have done Short-listed candidates will complete a small test scrape so I can review data quality and process fit. Looki...

    $410 Average bid
    130 bids

    I'm looking for a large number of lists and am looking to pay around $1 per 20 contacts with valid email addresses harvested from lists I'll give you. I only need proper, clean, verified and contactable info! Objective: outreach to sell books. You may use the stack you are most comfortable with (Python + BeautifulSoup, Scrapy, Selenium, Node.js + Puppeteer, or similar) as long as the final data is clean and deduplicated. Deliverables: • A CSV file containing each email address alongside the exact page URL where it was found • A brief note on the toolchain or script used (for reproducibility) I need the lists to be well-targeted, deduplicated, and usable for outreach. Accuracy matters. Do not give me a bloated list full of bounces... Are the contacts man...

    $96 Average bid
    72 bids

    I need a clean, well-structured list of Colorado-based people's phone numbers for prospecting health-life insurance policies. Please source every number exclusively from publicly available web directories; social media or random website scraping is outside the scope...Phone number 2. Contact or business name (if listed) 3. City / ZIP 4. Source URL Acceptance Criteria – Minimum 95 % of numbers must connect to a real line when spot-checked. – No entries drawn from outside Colorado. – Every row must include the source URL. When you apply, tell me about similar scraping projects you have completed and the tools you prefer (Python, BeautifulSoup, Scrapy, Octoparse, etc.). I’m ready to start as soon as I find someone with proven experience gathe...

    $115 Average bid
    67 bids

    ***** Please read the Word document with the full specs of the job ***** ...• Volume & speed: the full run (90k symbols) must finish inside the 4-6 hour window. • Output: one .xlsx file with rows for each symbol and either embedded images or file-path references to an accompanying images folder. • Stability: handle pagination, CAPTCHAs, rotating proxies, retries, and resume-from-last-point logging so a hiccup doesn’t force a restart. • Tech: I’m partial to Python (Scrapy, Playwright, or Selenium), but I’m open if you have a faster or more reliable stack. Please outline your proposed approach, main libraries, and any s...

    $472 Average bid
    128 bids

    I need a reliable web-scraping bot that automatically pulls fresh content from a specific news site on a schedule I can adjust. At minimum the script should capture the headline and full article text; if author name, publication date, and embedded image URLs can be extracted too, that’s a welcome bonus. Build it in Python using a well-supported stack such as Requests/BeautifulSoup, Scrapy, or Selenium, whatever you feel is most robust for handling pagination and occasional layout changes. The bot should: • Navigate through the latest articles section (and subsequent pages if present) • Respect robots.txt and reasonable rate-limits • Output clean, de-duplicated data to CSV or JSON and optionally push to a simple SQLite file For acceptance I’ll run th...

    $13 / hr Average bid
    42 bids

    ...(set by postal code) and captures every product available across all categories. The scraper has to run once every 24 hours, overwrite data in a structured file (CSV or JSON—your call), and handle the usual road-blocks on large sites such as store-selection prompts, pagination, lazy-loaded content, captchas, and IP throttling. What I expect from you • A clean, well-commented Python solution—Scrapy, Selenium, Playwright or a comparable stack are all fine as long as it is headless and low-maintenance. • A simple config or .env so I can change the target postal code or output path without touching the code. • A scheduler (cron job, Windows Task Scheduler, or a lightweight cloud function) that calls the scraper daily. • One sample dataset gen...

    $26 Average bid
    15 bids

    ...delays to avoid blocking • handle pagination and dynamic content (Selenium or similar only if JavaScript rendering is essential) • be easy for me to adjust—site URLs, HTML selectors, or output paths should sit in a single config file • log each run so I can trace errors or missing rows later Please write it in modern Python (3.10+), using popular libraries such as requests, BeautifulSoup, Scrapy, Selenium or Playwright—whatever best fits each target site—while keeping external dependencies minimal. Acceptance criteria • On execution, the script completes without errors and exports matched product records from at least one sample site I supply. • Code is commented clearly enough for a Python-literate user to extend to addition...

    $237 Average bid
    67 bids

    ...descriptions, category labels, and any text-based specifications; I do not need images or pricing data. The scraper should: • Handle pagination, dynamic or lazy-loaded sections, and common anti-bot measures without overloading the servers. • Output clean, well-structured data (CSV or JSON preferred) ready for import into my internal system. • Be written in readable, well-commented code; Python with Scrapy, BeautifulSoup, or Selenium is ideal, but I’m open to equivalent approaches if they achieve the same reliability. • Include simple configuration so I can add or swap target domains later. • Respect robots.txt directives and configurable request delays. Acceptance criteria 1. Running the script against a supplied test URL set returns all vi...

    $88 Average bid
    45 bids

    I need a reliable solution that automatically pulls public-facing text from specific social-media websites and delivers the content back to me in a clean, structured format ready for downstream analysis. The task covers three steps: building or customising a scraper in Python that navigates the chosen platforms, capturing posts, comments and any asso...public-data policies. Acceptance will be based on: • Accurate capture of the requested text fields from the sample profile list I provide • Fewer than 1 % duplicate rows after cleaning • Script runs end-to-end on my machine with only standard Python libraries or clearly listed open-source dependencies If you already have experience scraping social platforms via Selenium, BeautifulSoup, Scrapy or similar tools, t...

    $80 Average bid
    11 bids

    I'm seeking an experienced Python developer to help with data analysis and automate data scraping tasks. Key Requirements: - Develop Python scripts for data analysis. - Automate data scraping from websites. - Deliver clean, structured datasets for analysis. Ideal Skills: - Proficiency in Python, especially for data manipulation. - Experience with libraries like BeautifulSoup, Scrapy, or Selenium. - Strong background in data analysis, preferably with Pandas or NumPy. - Familiarity with data storage solutions (e.g., SQL, NoSQL). Looking for someone who can write efficient, reliable scripts and has a keen eye for detail.

    $544 Average bid
    87 bids

    ...on each listing (titles, descriptions, prices, SKU codes, category paths and any other on-page text detail), then supply the data back to me in a spreadsheet-ready format such as CSV or XLSX. I do not need images, only the textual content. Any solution you craft must respect the site’s robots.txt and avoid rate-limit issues so nothing gets blocked. A Python script using requests/BeautifulSoup, Scrapy, or a headless browser (Selenium, Playwright) is perfectly fine as long as your code is clearly commented and reusable; if you prefer another language, that works too provided the result meets the same standards. Deliverables: • Complete dataset of all product listings in CSV or XLSX • The scraping script with brief run instructions • One short report summarisin...

    $25 Average bid
    45 bids

    I need a clean, one-time extraction of every registered agent listed on the Kerala RERA portal. The scope is limited to the publicly displayed agent name plus all available contact details—phone numbers, email addresses, and office addresses. No licence-status fields or property listings are required. Any stack is fine—Python (BeautifulSoup, Scrapy, Selenium), Node.js, or a headless browser workflow—as long as it handles pagination, hidden rows, or JavaScript-rendered tables and respects polite scraping practices. Deliverables • An Excel workbook (.xlsx) containing one row per agent and clearly labelled columns for Name, Phone, Email, and Address • Data fully deduplicated, UTF-8 compliant, and free of blank placeholders • A short note on th...

    $13 Average bid
    44 bids

    I need two separate, non-blocking crawlers—one targeting Amazon, the other Flipkart—each operating with our seller accounts. Every 24 hours the system must fetch fresh data for the ASINs I supply: product name, seller name, star rating, ratings c...source code for both crawlers • MySQL schema and any migration scripts • Configuration for scheduling (cron, systemd timer, or equivalent) • Self-healing and retry mechanisms baked in • Email notification module with simple SMTP settings file • README covering setup, environment variables, and how to add new ASINs without touching the code Feel free to suggest the best stack—Python with Scrapy or Playwright, Node.js with Puppeteer, or another proven toolset—as long as it stays he...

    $77 Average bid
    32 bids

    Top scrapy Community Articles