BeautifulSoup jobs
...that appears on each listing (titles, descriptions, prices, SKU codes, category paths and any other on-page text detail), then supply the data back to me in a spreadsheet-ready format such as CSV or XLSX. I do not need images—only the textual content. Any solution you craft must respect the site’s robots.txt and avoid rate-limit issues so nothing gets blocked. A Python script using requests/BeautifulSoup, Scrapy, or a headless browser (Selenium, Playwright) is perfectly fine as long as your code is clearly commented and reusable; if you prefer another language, that works too provided the result meets the same standards. Deliverables: • Complete dataset of all product listings in CSV or XLSX • The scraping script with brief run instructions • One short repo...
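A posting like this can be answered with a short requests/BeautifulSoup script. The sketch below is a minimal illustration only: the CSS selectors and URL list are hypothetical placeholders, not taken from any real site.

```python
import csv
import time

import requests
from bs4 import BeautifulSoup

def parse_listing(html):
    """Pull the text fields from one listing page (selectors are placeholders)."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "title": soup.select_one("h1.title").get_text(strip=True),
        "price": soup.select_one(".price").get_text(strip=True),
        "sku": soup.select_one(".sku").get_text(strip=True),
        "description": soup.select_one(".description").get_text(" ", strip=True),
    }

def scrape_to_csv(urls, out_path="listings.csv"):
    """Fetch each listing politely and write all rows to one CSV."""
    rows = []
    for url in urls:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        rows.append(parse_listing(resp.text))
        time.sleep(1)  # polite delay so the site is not hammered
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
```

The CSV opens directly in Excel, satisfying the "spreadsheet-ready" requirement.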
I need a clean, one-time extraction of every registered agent listed on the Kerala RERA portal. The scope is limited to the publicly displayed agent name plus all available contact details—phone numbers, email addresses, and office addresses. No licence-status fields or property listings are required. Any stack is fine—Python (BeautifulSoup, Scrapy, Selenium), Node.js, or a headless browser workflow—as long as it handles pagination, hidden rows, or JavaScript-rendered tables and respects polite scraping practices. Deliverables • An Excel workbook (.xlsx) containing one row per agent and clearly labelled columns for Name, Phone, Email, and Address • Data fully deduplicated, UTF-8 compliant, and free of blank placeholders • A short note on th...
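For paginated portals of this kind, the usual pattern is a page loop feeding a parser, with deduplication and blank-row removal before export. A minimal sketch, assuming a hypothetical page-numbered URL and a plain HTML table; the real portal's markup will differ:

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

PAGE_URL = "https://example.gov/agents?page={}"  # placeholder pattern

def parse_page(html):
    """One dict per agent row; assumes a plain <table> with four columns."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table tr")[1:]:        # skip the header row
        cells = [td.get_text(strip=True) for td in tr.select("td")]
        if len(cells) >= 4:
            rows.append(dict(zip(["Name", "Phone", "Email", "Address"], cells)))
    return rows

def scrape_all(last_page, out_path="agents.xlsx"):
    records = []
    for page in range(1, last_page + 1):
        resp = requests.get(PAGE_URL.format(page), timeout=30)
        resp.raise_for_status()
        records.extend(parse_page(resp.text))
    df = pd.DataFrame(records).drop_duplicates()  # fully deduplicated
    df = df.replace("", pd.NA).dropna(how="all")  # no blank placeholders
    df.to_excel(out_path, index=False)            # requires openpyxl
    return df
```

If the table is JavaScript-rendered, the fetch step would swap to Selenium or Playwright while the parser stays the same.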
...supplier that appears on bridebook.co.uk. We need the suppliers of each category across the whole UK, so it will be roughly 14,000 records. What we actually need in the spreadsheet is very simple: the supplier names and emails, nothing more. Please deliver one Excel file with two clearly labelled columns (Name | Email). We have no preference on how you collect the data—manual collection, Python/BeautifulSoup, Selenium, or any other web-scraping approach is fine—as long as the final sheet is accurate and complete. Don't just place your bid blindly; visit the link and explore the Suppliers Category section - there are many categories, such as Florist, Music, Videographers, etc. Fixed Budget: $50. We have certain strict rules to abide by, so those who follow those strictly ...
I have a website that holds roughly a thousand individual entries I need pulled down. Each record contains only text fields—no images, media, or attachments—so the task is strictly text-based scraping. Once you capture the data, just hand it over in a single CSV file; I don’t need any additional filtering, re-formatting, or post-processing on your side. For transparency and future reuse, a short, well-commented script (Python with BeautifulSoup, Scrapy, or a similar tool) that reproduces the extraction would be appreciated, but my primary deliverable is the finished CSV. If you can move quickly and ensure accuracy—no missing rows, clean line breaks, and UTF-8 encoding—I’...
...data accuracy: After fixing the errors, ensure that all data presented in the website is accurate and correctly reflects the information from the respective travel agencies. Collaborate on requirements: Discuss specific requirements and understand the nature of the data discrepancies before implementing fixes. Key Skills Required: Web Scraping: Experience in web scraping frameworks (e.g., BeautifulSoup, Scrapy, Selenium). Data Analysis: Ability to work with data structures, data cleaning, and presentation. Familiarity with Pandas or similar tools is a plus. Backend Development: Proficiency in backend technologies (e.g., Python, Node.js) to handle web scraping and data processing. Frontend Development: Experience with frontend technologies (e.g., React, HTML/CSS, JavaScript) to ...
I need a clean data set built from the “List of tools and equipment” page on Wikipedia. For every entry that sits under the categories I’m after—Hand tools, Power too...every image you reference and place them in a single zipped folder, keeping the original file names intact so they match the CSV row. I’ll review the work by opening the CSV in Excel, checking that each row populates correctly, verifying that image links load in a browser, and confirming that the zipped archive contains a matching file for every URL. Use whichever stack you prefer—Python with BeautifulSoup/Scrapy and pandas is fine as long as the code is reliable and UTF-8 safe. Let me know if any image is missing on the source page so we can decide whether to skip or note it. ...
...product information from several e-commerce websites. The focus is on two key data points: • Product name and full description • Customer reviews and ratings Price and availability are not required this time, so the crawler can ignore any endpoints related to stock or cost. Please build the script so I can run it on demand and easily point it at new store URLs in the future. Python with BeautifulSoup, Scrapy, or a similar framework suits me fine, as long as the code is clean, well-commented, and leverages polite scraping practices (respectful delays, user-agent rotation, handling captchas when possible). Deliverables: 1. Working scraper code with clear setup instructions 2. Sample CSV or JSON export containing the requested fields 3. Short README explain...
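The "polite scraping practices" this posting asks for (respectful delays, user-agent rotation) can be wrapped in one small helper. A sketch; the user-agent strings are sample values, and real ones should mirror current browser releases:

```python
import random
import time

import requests

USER_AGENTS = [  # sample pool; extend with real browser strings
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]

def build_headers():
    """Pick a user-agent at random for each request."""
    return {"User-Agent": random.choice(USER_AGENTS)}

def polite_get(url, min_delay=1.0, max_delay=3.0):
    """Sleep a randomized interval, then fetch with a rotated user-agent."""
    time.sleep(random.uniform(min_delay, max_delay))
    return requests.get(url, headers=build_headers(), timeout=30)
```

CAPTCHA handling would sit on top of this, typically via a solving service or a headless browser; the helper only covers the delay and header rotation.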
...following platforms: - Udemy - Coursera - YouTube The collected dataset should contain the following information for each record: 1. Course / Tutorial Title 2. Course URL / Video URL 3. Platform Name (Udemy / Coursera / YouTube) 4. Category (Technical or Non-Technical) 5. Paid or Free Indicator Technical Expectations: - Data scraping can be performed using tools such as Python, BeautifulSoup, Selenium, Scrapy, or other efficient scraping frameworks. - The freelancer must handle pagination, dynamic loading, and scrolling where required. - Duplicate entries should be avoided. - The final dataset should be clean and well-structured. Output Format: The final deliverable should be provided in one of the following formats: - CSV - Excel - JSON The dataset...
...the text content pulled from a small set of public websites. Each site is straightforward (no log-ins or CAPTCHAs), and I only require the visible textual information; images, links, or other assets are not part of the scope. Please capture the text accurately and return it in a clean, structured format that I can sort or filter later—CSV or Excel is fine. A lightweight Python script using BeautifulSoup, Scrapy, or a similar library is ideal so I can rerun the extraction if the pages change. Include clear comments in the code and a brief read-me so I understand how to execute it on my end. Deliverables: • Well-documented scraping script • Extracted text for each website in CSV/Excel • Short instructions for reruns or small tweaks The job is simple and ...
valid email addresses harvested from a list I'll give you. Objective: outreach to sell books. Use whichever stack you're most comfortable with (Python + BeautifulSoup, Scrapy, Selenium, Node.js + Puppeteer, or similar), as long as the final data is clean and deduplicated. Deliverables: • A CSV file containing each email address alongside the exact page URL where it was found • A brief note on the toolchain or script used (for reproducibility) Accuracy matters more than sheer volume. Do not give me a bloated list full of bounces; I only need clean, verified, and contactable info.
...website. The information to capture will centre on business details, such as registration numbers, business name, registration date, address, etc. The workflow should: • Navigate every relevant section of the site (pagination, search filters, subsidiary pages). • Extract the required fields accurately • Export clean, structured data to CSV and JSON A Python solution leveraging requests/BeautifulSoup or Scrapy is preferred, but I’m open to other dependable stacks if they handle rate-limits, retries, and potential CAPTCHA gracefully. The script must be easy to rerun on demand, with clear instructions for environment setup and any dependencies. Acceptance criteria will be a sample scrape of 500 records that match the live site exactly, plus the commen...
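Handling rate limits and retries "gracefully", as this posting asks, is often done by wiring a requests Session to urllib3's Retry so transient 429/5xx responses back off and retry automatically. A sketch:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session(total_retries=5, backoff=1.0):
    """A Session that retries 429/5xx responses with exponential backoff,
    so transient rate-limit hits do not kill the run."""
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff,               # grows the wait between attempts
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["GET"],
    )
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

A CAPTCHA still needs separate handling (a solving service or headless browser); this only covers network-level resilience.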
I need a complete scrape of that covers masonr...rate-limiting so we stay within acceptable request volumes. Deduplication is essential—if the same company appears under multiple categories or listings, merge the records instead of inflating the count. Deliverables • One clean .xlsx file containing all requested fields, ready for filtering and analysis • A brief text log explaining the scraping workflow, libraries used (e.g., Python–Selenium/BeautifulSoup, Node–Puppeteer, etc.), and any known data gaps • Confirmation that the crawl completed for every U.S. state and territory, without regional bias I will review the spreadsheet for completeness, spot-check against live BBB pages, and verify there are no broken rows or inconsistent column h...
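The merge requirement here (a company listed under several categories becomes one record) is a group-and-aggregate, not a plain drop-duplicates. A small pandas sketch with hypothetical column names:

```python
import pandas as pd

def merge_duplicates(df):
    """Collapse rows that share company and phone into one record,
    combining the categories instead of inflating the count."""
    return (
        df.groupby(["company", "phone"], as_index=False)
          .agg({
              "category": lambda s: "; ".join(sorted(set(s))),
              "address": "first",
          })
    )
```

Choosing the grouping key matters: name alone over-merges franchises, so pairing it with a phone number or address is a common compromise.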
I have around 7,000 Aliexpress products that I need fully harvested for content-creation purposes. From each listing I only require the official product photos and an...product URL or SKU will make downstream editing much easier for me. Deliverables: • Folder structure or archive segmented by product (one folder per listing). • Inside each folder: all JPEG images and any MP4 videos found. • A simple CSV mapping product URL → asset file names so I can trace anything quickly. I’m happy for you to choose the most efficient tooling—Python with Selenium, BeautifulSoup, or similar headless solutions are fine—as long as the final package is complete and safely transferable via cloud link. Let me know your estimated turnaround time and any clarifi...
...publicly visible. Because many conference sites load content dynamically, the scraper must be able to render JavaScript when needed; a headless Selenium setup or an equivalent solution in Python 3 is fine. Please structure the code so I can add or remove conference URLs easily, and store all results in a clean CSV (UTF-8) or Excel file. Deliverables • Well-commented Python script (requests/BeautifulSoup for static pages, Selenium for dynamic ones) • and a short README describing setup and usage • One sample output file created from at least one conference URL I provide for testing I’ll review the job as complete once the script reliably extracts the fields above, handles pagination where applicable, and finishes without runtime errors on the test ...
...isn’t cleaned properly (need help with parsing and formatting). - The code runs slowly for large datasets (need optimization). - The code doesn’t handle certain edge cases where the webpage layout changes (need fixes for those). I will provide access to the code and sample data to help you diagnose the issues. Requirements: Expertise in Python (especially with web scraping libraries like BeautifulSoup, requests, Scrapy, etc.). Strong understanding of data parsing and optimization techniques. Experience in handling edge cases in scraping. Timeline: I need the bugs fixed within 3 days. The sooner, the better! Budget: $30-$50, depending on how quickly the issues can be resolved....
...details plus up-to-date owner contact information. will serve as a second source so the final list contains additional prospects pulled from that platform (matching or related to the same asset classes). All data must be scraped, deduped, and formatted in a single spreadsheet so I can sort, filter, and launch campaigns immediately. Use whatever stack you prefer—Python, Selenium, BeautifulSoup, Apify, or similar—but the workflow has to respect each site’s TOS and deliver reliable results. Deliverables • CSV/Excel file with Propstream property details and owner contacts for commercial buildings and apartments • Separate tab or merged columns with the corresponding leads • Basic data hygiene: no duplicates, valid phone/email formats, c...
...fuel-economy figures, in-car technology features, seating layouts and any other attributes exposed on the page. The end goal is a clean, analysis-ready Excel workbook that lets me run market-wide comparisons, so consistency is critical: headings must be standardised, units normalised and categorical values written the same way across the entire sheet. I am happy for you to use Python, Scrapy, BeautifulSoup, Selenium, AI-assisted extraction—whatever combination you trust—to pull the information, as long as the final file is accurate and complete. Data standardisation is essential. To keep things efficient I’d like a small sample delivered early so we can confirm structure before you harvest the full set. Once the sample is approved, scrape the remaining URL...
...target site automatically, including date pickers and location selectors. • Rotate user-agents / proxies or apply any other anti-bot tactics necessary to stay undetected. • Capture and log errors so a failed request never silently drops a row. • Be easy for me to rerun on demand—command-line or small web UI is fine, as long as setup is straightforward. I’m comfortable with Python (BeautifulSoup, Selenium, Playwright, Scrapy, etc.) or another language you can justify, as long as you hand over all source code, dependency files, and a quick start README. A brief demo video or screenshots validating the scraper against at least one aggregator and one direct company site will serve as the final acceptance test. When you bid, point me to a project...
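The "never silently drops a row" requirement usually means wrapping every fetch so a failure is logged and surfaced as an explicit None. A minimal sketch:

```python
import logging

import requests

logging.basicConfig(filename="scraper_errors.log", level=logging.INFO)

def fetch_or_log(url, session=None):
    """Return the page HTML, or None after logging the failure, so a bad
    request is visible in the log instead of silently dropping a row."""
    try:
        resp = (session or requests).get(url, timeout=30)
        resp.raise_for_status()
        return resp.text
    except requests.RequestException as exc:
        logging.error("FAILED %s: %s", url, exc)
        return None
```

The caller can collect every URL that returned None and retry the whole batch at the end of the run.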
...required. Core job • Navigate through all product listings on the site, follow pagination, and fetch fields such as product name, model/SKU, description, category, list price, discount price (if present), currency, and product URL. • Store the results in both CSV and JSON so I can easily import them into our internal tools. Technical expectations • Python 3.x with either Scrapy or BeautifulSoup/Requests; Selenium is acceptable only if the target pages rely heavily on JavaScript. • Respect and add polite throttling plus user-agent rotation to avoid blocking. • Code should be modular and ready for me to change the target domain or output path by editing a single config file. • Include a short README that explains how to install depen...
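The single-config-file requirement can be met by loading every tunable (target domain, output paths, delays) from one JSON file. A sketch with placeholder defaults:

```python
import json
from pathlib import Path

DEFAULT_CONFIG = {  # placeholder values; edit config.json, not the code
    "base_url": "https://example-store.com/products?page={page}",
    "output_csv": "products.csv",
    "output_json": "products.json",
    "delay_seconds": 2,
}

def load_config(path="config.json"):
    """Read all tunables from one JSON file; write defaults on first run."""
    p = Path(path)
    if not p.exists():
        p.write_text(json.dumps(DEFAULT_CONFIG, indent=2), encoding="utf-8")
    return json.loads(p.read_text(encoding="utf-8"))
```

Pointing the scraper at a new domain then means editing `base_url` in the JSON file and nothing else.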
...management has replied, both the reply text and the reply date At least half of the collected reviews must include a management response. The final deliverable is a clean, well-structured CSV that combines all fields. No language filtering is needed—capture reviews in any language exactly as they appear, keeping accents and special characters intact. You can use Python with Scrapy, Selenium, BeautifulSoup or whichever stack you prefer, provided you respect Tripadvisor’s loading patterns, pagination and dynamic elements so nothing is missed. Please maintain strict accuracy: no duplicated rows, correct hotel-to-review matching, and consistent field order. When you reply, attach or link to one or two brief samples from similar scraping projects (CSV snippets or sc...
...birth, nationality, position, current club, height/weight (where listed), and any notable career highlights that appear on the player’s own site. Because these pages vary in structure, the code should be resilient: graceful error handling, user-agent rotation, and clear selectors or XPath rules that are easy for me to extend later. I’m comfortable running Python, so libraries like Requests, BeautifulSoup, Selenium, or Scrapy are welcome; please choose the stack that gives the best balance of speed and maintainability. Deliverable • A runnable script (with a brief README) • The resulting CSV generated from a short test run (5–10 players is fine for proof) • Comments in the code explaining each major step Acceptance • Script executes fro...
...(up to 5 possible stages: Entrance, Plate, Drink, Dessert, Coffee, though a menu may include only 2 of the 5; example: Plate+Drink). Please also tag each record with its district and municipality so the file can be filtered regionally. Deliverables 1. A single CSV or Excel file containing one row per restaurant with all fields above clearly labelled. 2. The script or notebook you use (Python with BeautifulSoup, Scrapy, Selenium, or any other tool you prefer) so I can rerun the scrape later. Acceptance criteria • No duplicate restaurants. • All mandatory columns populated where data exists on Google. • At least 95% of entries correctly classified for fixed-price menu status and pricing. Keep the approach respectful of Google’s terms of service. If you ...
I have a spreadsheet (~5,000 product rows) where each row contains a full HTML eBay listing template. Each row includes: ID SKU Description Short description The Description field contains a large block of HTML (decorative listing template), but the actual product description is embedded inside it. Your job is to extract the correct text. 1. Ext...final text Clean up any leftover spacing 4. Final Output Write the cleaned text into the Description column Completely clear the Short description column for all rows Do not modify the SKU column Deliverable One cleaned Excel file with: Cleaned Description column Empty Short description column Requirements Must be completed programmatically (Python preferred) Experience parsing HTML (e.g., BeautifulSoup or similar) Strong attention...
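A job like this reduces to mapping a BeautifulSoup text extraction over the Description column while leaving SKU alone. A sketch; the `#description` container id is an assumed placeholder for wherever the real template keeps the product text, not a known eBay structure:

```python
import pandas as pd
from bs4 import BeautifulSoup

def extract_description(html):
    """Strip the decorative template down to readable text. The
    '#description' selector is a guess; adjust it to the real template."""
    if not isinstance(html, str):
        return ""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style"]):
        tag.decompose()
    target = soup.select_one("#description") or soup  # fall back to all text
    text = target.get_text(" ", strip=True)
    return " ".join(text.split())                     # collapse leftover spacing

def clean_workbook(in_path, out_path):
    df = pd.read_excel(in_path)                       # SKU column is untouched
    df["Description"] = df["Description"].map(extract_description)
    df["Short description"] = ""                      # cleared for every row
    df.to_excel(out_path, index=False)
```

Spot-checking a few rows of the output against the original HTML is the fastest way to confirm the selector captures the right block.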
...coverage of all pages and pagination is essential. Please put each field in a separate column so we can analyze the data more easily. Deliverable • One .xlsx file containing a single sheet with the cleaned dataset As soon as the file opens without errors and every listed tractor is represented with the five fields correctly typed, the job is finished. If you already have experience with Python (BeautifulSoup, Selenium, Scrapy) or similar scraping tools and can turn this around quickly, let me know your timeframe.
...product information from a specific website and drops it into a neat, well-structured Excel workbook each month. The data points are the standard e-commerce essentials—name, price, SKU, description, availability and any variants that appear on the page. If the site nests details behind dynamic elements, please factor that in; I still expect a complete dataset. Your script can run in Python (BeautifulSoup, Scrapy or Selenium are all fine) or any language you prefer, as long as the final output is a tidy .xlsx file ready for analysis. I’ll trigger the run once a month, so the process should be repeatable with minimal manual tweaking—ideally a single command or scheduled task. Acceptance criteria • A working scraper that navigates pagination and captures e...
...JSON or CSV, and drop it straight into a folder or database I point you to. I already have the list of news domains and sample URLs. Your code should: • respect robots.txt and rate limits, • rotate user-agents / proxies if a site blocks frequent requests, • be easy to extend when a new site is added, and • run headlessly from a cron job or similar scheduler. Python with Scrapy, BeautifulSoup, or Playwright is preferred, but I’m open if you can justify another stack. Clear inline comments plus a short README are essential so I can maintain the scraper myself after hand-off. Please include a quick demonstration—scrape five sample articles and provide the resulting JSON so I can verify the field mapping. I’ll consider the project complete whe...
...decision-making. Project Scope: Scrape data from specified websites (details will be provided) Extract relevant fields (e.g., names, emails, prices, listings, etc.) Clean and structure the data (CSV, Excel, or database format) Ensure data accuracy and avoid duplicates Handle pagination, dynamic content, or login (if required) Requirements: Strong experience with web scraping tools (Python, BeautifulSoup, Scrapy, Selenium, etc.) Ability to handle anti-bot protections if needed Experience with data cleaning and formatting Attention to detail and reliability Nice to Have: Experience with automation and scheduling scripts Knowledge of APIs (if available instead of scraping) Deliverables: Clean, well-structured dataset Scraping script (optional but preferred) Documentation on ...
...each address through an SMTP-level verifier (ZeroBounce, NeverBounce, or an in-house Python verifier—whichever you prefer, as long as it returns status codes for valid, invalid, catch-all, disposable, and role accounts). 4. Output only “valid” or “catch-all” emails in a downloadable CSV along with their metadata. • Technical notes - I’m comfortable with Python (Scrapy, Requests, BeautifulSoup, Selenium) or Node (Puppeteer, Cheerio); choose whichever stack you can scale and maintain. - Respect Google’s ToS with rotating residential proxies or a paid SERP API to avoid blocking and captchas. - The job should run via a daily cron or cloud function and log results to a lightweight dashboard (even a simple Flask/Expr...
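Before paying for SMTP-level verification, a cheap local pre-filter can discard syntactically invalid and role-based addresses so only plausible candidates reach the paid check. A sketch; the regex and role list are illustrative, not exhaustive:

```python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
ROLE_PREFIXES = {"admin", "info", "support", "noreply", "sales", "contact"}

def prefilter(email):
    """Classify an address before the paid SMTP check:
    'invalid' (bad syntax), 'role' (generic inbox), or 'candidate'."""
    email = email.strip().lower()
    if not EMAIL_RE.match(email):
        return "invalid"
    if email.split("@")[0] in ROLE_PREFIXES:
        return "role"
    return "candidate"
```

Only "candidate" addresses would then be sent to ZeroBounce/NeverBounce, which keeps per-lookup costs down.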
I'm looking for a skilled web scraper to extract product images and descriptions from Yupoo. The scraped data should be organized and delivered in a CSV file. Requirements: - Experience with web scraping tools and technologies - Ability to handle dynamic content on Yupoo - Attention to detail to ensure data accuracy - Proficient in data organization and CSV formatting Ideal Skills: - Python or similar programming languages - Familiarity with libraries like BeautifulSoup or Scrapy - Previous experience scraping eco...
Hi, I need a data scraper who can scrape data from provided sources. Skills: Core Technical Skills The freelancer should know Python (the most common scraping language) with libraries like Scrapy, BeautifulSoup, or Playwright. They should also be comfortable with browser automation tools like Selenium or Puppeteer (JavaScript-based), since sites like TipRanks are JavaScript-heavy and need a real browser to render. Anti-Bot Bypass Experience This is the most critical skill for this case. Look for someone experienced with handling CAPTCHAs (2Captcha, Anti-Captcha services), rotating proxies and residential IPs, spoofing browser headers and fingerprints, and bypassing Cloudflare or similar bot protection. TipRanks specifically uses these protections, so this experience ...
A reusable Python script is required to automate data scraping from a series of publicly accessible web pages. The script should accept a list of URLs, navigate through any paginated content, extract the specified fields, and save the results to CSV and JSON. The task suits someone with an intermediate grasp of Python who is comfortable working with libraries such as requests, BeautifulSoup, pandas, or, when a site relies on JavaScript, Selenium or Playwright. Clear, well-commented code and concise setup instructions are essential so the script can be dropped into an existing workflow without modification. Acceptance criteria and deliverables: • Fully functional .py script that runs from the command line. • Configuration section (or .env file) for URL list and fiel...
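A command-line script of the requested shape typically pairs argparse for the URL list with dual CSV/JSON writers. A sketch of those two pieces; the flag names are placeholders:

```python
import argparse
import csv
import json

def parse_args(argv=None):
    p = argparse.ArgumentParser(description="Scrape a list of URLs")
    p.add_argument("--urls", required=True, help="text file, one URL per line")
    p.add_argument("--csv-out", default="results.csv")
    p.add_argument("--json-out", default="results.json")
    return p.parse_args(argv)

def save_results(rows, csv_path, json_path):
    """Write the same extracted records to both CSV and JSON."""
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, ensure_ascii=False, indent=2)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
```

The fetch/parse logic plugs in between: read the URL file, produce a list of dicts, hand it to `save_results`.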
I want to bring together airline revie...schema optimised for quick filtered queries on large text fields. • Search interface with filter controls (think airline, date range, star rating, sentiment, etc.) that returns results fast and can handle pagination. • Simple admin or dashboard screen where I can verify scraper status, trigger a manual update, or adjust the fetch schedule. I am open to your preferred tech stack—Python with BeautifulSoup or Scrapy for retrieval, Node or Django on the back end, ElasticSearch or PostgreSQL for search indexing, React or plain Vue for the front end, whatever you feel offers the best balance of maintainability and performance. Please outline which tools you would use, how you will structure the update schedule, and an estimated...
...straightforward: the bot finds a list of channels, fetches the email field inside each description, checks that it hasn’t mailed that address before, and then fires off the tailored message. It should pause or queue itself so we never exceed 50 sends in a 24-hour window and respect reasonable delays between messages to stay under Gmail / SMTP limits. I’m comfortable if you code it in Python—Selenium, BeautifulSoup, or the official YouTube Data API are all fine—as long as the final script runs on my Windows laptop or a lightweight VPS. Configuration for my SMTP credentials must be simple and secure (preferably loaded from an environment file). A small log file or dashboard that shows which channels were processed, whether an email was found, and the send s...
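The 50-sends-per-24-hours cap can be enforced with a small timestamp log consulted before every send. A sketch; `send_fn` stands in for whatever SMTP or API call actually delivers the message:

```python
import json
import time
from pathlib import Path

LOG = Path("send_log.json")   # timestamps of every successful send
DAILY_CAP = 50

def sent_in_last_24h(now=None):
    """Timestamps of sends inside the rolling 24-hour window."""
    now = now or time.time()
    if not LOG.exists():
        return []
    return [t for t in json.loads(LOG.read_text()) if now - t < 24 * 3600]

def try_send(send_fn, address, now=None):
    """Call send_fn(address) only if under the 24-hour cap; otherwise the
    caller should queue the address and retry later."""
    now = now or time.time()
    recent = sent_in_last_24h(now)
    if len(recent) >= DAILY_CAP:
        return False
    send_fn(address)
    LOG.write_text(json.dumps(recent + [now]))
    return True
```

Because the window is rolling rather than calendar-based, the bot never bursts past the cap even across restarts.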
...Responsibilities Debug and fix the existing scraping script Identify and resolve issues causing failures or incomplete data Ensure stable and consistent data extraction Handle pagination, dynamic content, or anti-bot protections if needed Improve performance and efficiency of the scraper Deliver clean, structured output (CSV / JSON / database) Technical Requirements Strong experience with Python (BeautifulSoup, Scrapy, Selenium, or similar) Experience handling JavaScript-rendered websites Familiarity with proxies, headers, and anti-bot bypass techniques Ability to troubleshoot and optimize existing code Experience with data formatting and storage Deliverables Fully working and stable scraping script Clean and structured dataset output Documentation on how to run and maintain the ...
...published phone number and/or email exactly as it appears in the ad. • Note the ad’s URL, title, price and posting date so my client can reference it. • Deliver everything in a simple CSV or Google Sheet that I can forward directly. I am not asking for a one-size-fits-all scraper; the job is a personalised contact-finding service. If you prefer to automate parts of the process with Python, BeautifulSoup, Selenium, Apify, etc., that is fine as long as the results stay accurate and Leboncoin’s terms of use and rate limits are respected. Turnaround is important to me: once I drop you a batch of links or search criteria, I need the contact file back the same day (or within the hour for smaller batches). Fluency in French and an eye for detail are essential...
...when Google exposes it Flow I have in mind 1. Front-end search box (simple HTML/JS is fine). 2. Back-end service that grabs the requested place page, extracts the data above, and returns JSON. 3. Results rendered in a clean table or card layout directly in the browser. No extra analytics or reports are needed right now. Technical notes • Runs on my existing Linux VPS. A Python (BeautifulSoup + Selenium/Playwright) or Node (Puppeteer) stack is perfectly acceptable—choose whichever can handle dynamic content and the popular-times chart reliably. • Please include an easy setup script or Dockerfile so I can deploy in one step, plus concise instructions on how to add my own Google session cookies or proxies if required to avoid blocking. • I’...
...started throwing runtime errors before any data is returned. The pages themselves load fine in a browser, so the problem is clearly within my code or its dependencies. Here’s what you can expect from me: a zipped folder containing the current Python script, the list of target URLs, and a copy of the last error stack trace. I’ll also let you know which Python version and libraries (requests, BeautifulSoup, etc.) are in use so you can replicate the issue quickly. What I need from you: • Diagnose the exact cause of the errors • Deliver a clean, well-commented fix that reliably fetches all required text from each page • Briefly outline any library or environment changes I must make to keep the scraper stable in the future Once the script runs end-t...
...the app through its REST/GraphQL API, on a daily or weekly schedule I can adjust For catalogs, be ready to download PDFs or scrape product pages, then pull out spec sheets and key attributes. Everything must remain traceable back to the original URL or document. Build the agents end-to-end—scraper, NLP enrichment, database, scheduler, and API integration. Python (Scrapy, LangChain, OpenAI, BeautifulSoup), or a comparable stack, is ideal but not mandatory if you can deliver the same reliability and speed. Timing is critical; I’d like a working MVP ASAP, followed by refinements once we see real data flowing. Please share a concise plan, the toolset you prefer, and examples of similar automations you have already deployed....
...country, full duration, tuition fees and all stated entry-requirement details. Compile the results in one tidy Excel workbook, single sheet, with each row representing a unique program and each column matching the fields above. Accuracy matters: no duplicates, consistent currency symbols, and clean text with no hidden HTML tags. If you gather the data with an automated script (Python + BeautifulSoup/Selenium or similar) that you can hand over alongside the spreadsheet, even better—it will let me refresh the list later. On delivery I will quickly verify random samples against the live site, so the sheet should be ready to pass that check without edits. Deliverables: • .xlsx file containing all scraped programs in one sheet • (Optional but appreciated) the ...
...rent comps, calculate projected NOI, cap rate, cash-on-cash and simple IRR, then package the results in both a downloadable spreadsheet and a JSON payload so I can drop the file straight into my CRM. Editable assumption cells for vacancy, expense ratio, financing terms and exit cap are essential so I can fine-tune scenarios on the fly. I am language-agnostic, though Python (Pandas, Requests, BeautifulSoup or Selenium for scraping, plus Jupyter for quick tweaks) or JavaScript with Node and relevant SDKs would be easiest for me to maintain. If an official API is required I will provide keys; otherwise build a scraper that respects rate limits and captcha challenges. Deliverables: • Full source code with clear in-line comments • Setup guide • Example CSV
...category, a sub-folder for every product, named with the product title or SKU • Within that folder: – a text file (UTF-8) containing the product description exactly as it appears online – all product images saved as JPG or PNG, keeping their original quality and naming them sequentially (image-1, image-2, …) Feel free to use the web-scraping stack you’re most comfortable with—Python, BeautifulSoup, Selenium, Scrapy, or equivalent—as long as the final structure matches what’s outlined above and every product on the site is included. Before delivery, spot-check at least 10 random products to confirm that descriptions are complete, images are viewable, and folder names are accurate. The products should match the...
...refine) a scraper that reliably pulls pure text data from the official Supreme Court of India website as well as each High Court site. The crawler must respect robots.txt, handle pagination, and normalise judgments, orders and daily cause-lists into a clean structure before inserting them straight into our existing database (PostgreSQL). If you prefer Python, feel free to combine requests, BeautifulSoup, Selenium or Playwright—whatever keeps it stable and headless-friendly. 2) Trace and fix the broken “Research” feature inside the AI tool. It currently fails when it tries to query the new tables; the bug appears in the indexing layer, not the LLM itself. Once the scraper has seeded fresh data, the search endpoint should return relevant citations in under two seconds. I’...
I’m building targeted outreach lists and need an experienced web-scraper or data-miner to pull fresh, accurate contact information for four potential...your own. Accident reports may come from either side of the Atlantic, while all other datasets must be strictly U.S.-based. Within the United States I have no regional preference—feel free to harvest from any state that yields the volume and quality we’re after. Please outline in your bid: • Your scraping approach and the specific tools, APIs, or custom scripts you plan to use (Python, BeautifulSoup, Scrapy, Octoparse, etc.). • A brief example of similar lead-gen or public-record extractions you’ve completed, preferably showing match rates or verification steps. • Estimated record count...
...data—primarily HTML tables—from a target website and load the results straight into a database. The final dataset must be complete, tidy, and ready for querying. While the focus is on table-based information, the site also contains a few supporting images that I want captured in JPEG format and referenced correctly in the data you store. I’m flexible on language and tooling, though Python with BeautifulSoup / Scrapy, Node with Cheerio, or similar frameworks are welcome so long as the code is clean, well-commented, and can be rerun without manual tweaks when the site updates. Deliverables: • A working scraper with source code • A populated database (MySQL, PostgreSQL, or SQLite—use what best fits; include the schema) • Stored J...
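The "straight into a database, rerunnable without manual tweaks" part might look like the following SQLite sketch; the table name, columns, and the clear-and-reload idempotency strategy are assumptions, and MySQL/PostgreSQL would differ only in the connection line:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS records (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    value TEXT,
    image_path TEXT   -- local path of the saved JPEG referenced by this row
);
"""

def load_rows(conn, rows):
    """Insert (name, value, image_path) tuples. Clearing first makes the
    loader safe to rerun after the site updates, without duplicate rows."""
    with conn:
        conn.execute("DELETE FROM records")
        conn.executemany(
            "INSERT INTO records (name, value, image_path) VALUES (?, ?, ?)",
            rows,
        )

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
load_rows(conn, [
    ("alpha", "1", "img/alpha.jpg"),
    ("beta", "2", None),   # not every row has a supporting image
])
count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```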
...status • status code • error • post column name • post text • word count • flag reason User interface: build a simple web app with • workbook upload • reporting mode selector • complete-workbook option • week selector • start-date and end-date selector • run-validation button • progress display • summary preview • download buttons for Excel, Word, and PDF Preferred stack: Python (Streamlit or Flask), pandas, openpyxl, requests, BeautifulSoup, python-docx, and ReportLab, WeasyPrint, or an HTML-to-PDF converter Functional details — Workbook handling: • support multiple worksheets • ignore empty sheets safely • handle inconsistent headers where possible • log worksheet-level issues without stopping the run Date and week handling: • parse mixed date formats safely • support Excel date cells and text dates • normalize week values where possible • prese...
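The "parse mixed date formats safely / support Excel date cells and text dates" items could be sketched as below; the list of text formats is an assumption about what the workbooks contain, and ambiguous day/month strings would need a project decision:

```python
from datetime import date, datetime, timedelta

TEXT_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%b-%Y", "%m/%d/%Y")  # assumed formats
EXCEL_EPOCH = date(1899, 12, 30)  # origin of Excel's 1900-system serial dates

def parse_cell_date(value):
    """Return a date for Excel serials, datetime cells, or common text
    formats; return None (log it, don't stop the run) when nothing matches."""
    if isinstance(value, datetime):        # openpyxl yields datetime for date cells
        return value.date()
    if isinstance(value, (int, float)):    # Excel stores dates as serial numbers
        return EXCEL_EPOCH + timedelta(days=int(value))
    for fmt in TEXT_FORMATS:
        try:
            return datetime.strptime(str(value).strip(), fmt).date()
        except ValueError:
            continue
    return None
```

A row whose date comes back `None` would be written to the worksheet-level issue log rather than aborting validation.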
...Companies) • State • Company Type (Private Limited / LLP / OPC / Public Limited) • Authorized Capital (if available) • Registered Office Address (if available) The system should: • run automatically daily (cron / scheduler) • avoid duplicate records • export data to Excel / CSV / Google Sheets / an API endpoint • handle captcha or dynamic website structure if applicable Preferred technologies: Python (BeautifulSoup / Scrapy / Selenium), Node.js (Puppeteer / Playwright), or any robust scraping framework Deliverables: • fully working scraper or API system • source code • documentation for running the script • optional: dashboard or automated email delivery of daily data Additional preferred features (bonus): • historical data scraping • cloud deployment (AWS / DigitalOcean / VPS) • API endpoint to ...
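The avoid-duplicates and CSV-export requirements could be sketched like this; using the CIN as the dedup key across daily runs is an assumption (it is the natural unique identifier for Indian company records):

```python
import csv
import io

def dedupe(records, key="cin"):
    """Keep the first record per key, comparing case-insensitively,
    so the same company scraped on two days yields one row."""
    seen, out = set(), []
    for rec in records:
        k = rec[key].strip().upper()
        if k not in seen:
            seen.add(k)
            out.append(rec)
    return out

def to_csv(records, fields):
    """Serialise records to CSV text ready for Excel / Google Sheets import."""
    buf = io.StringIO()
    w = csv.DictWriter(buf, fieldnames=fields)
    w.writeheader()
    w.writerows(records)
    return buf.getvalue()

daily = [
    {"cin": "U12345KL2024PTC000001", "name": "Acme Pvt Ltd"},
    {"cin": "u12345kl2024ptc000001", "name": "Acme Pvt Ltd"},  # same CIN, different case
    {"cin": "U99999KL2024LLP000002", "name": "Beta LLP"},
]
unique = dedupe(daily)
```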
...script must capture the course title, full description, instructor name or bio, and all published schedule or date details, then output everything in a clean, structured file (CSV or JSON works for me). The sites vary slightly in layout, so the code should gracefully handle both static HTML and any light JavaScript a page might rely on. Python is my preferred stack; feel free to use Scrapy, BeautifulSoup, Selenium or similar libraries as needed, provided the final solution runs headless and unattended on a Linux server via cron. Deliverables • Well-commented source code • A sample output file showing one full scrape from each site • Setup notes so I can schedule the daily run on my end I’ll consider the project complete once I can execute the jo...
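Since the sites vary slightly in layout, one approach is a fallback chain of selectors per field rather than one hard-coded selector; the selectors and sample markup here are hypothetical:

```python
from bs4 import BeautifulSoup

def first_text(soup, selectors):
    """Try site-specific selectors in order and return the first non-empty
    text hit, so one extractor covers several page layouts."""
    for sel in selectors:
        node = soup.select_one(sel)
        if node and node.get_text(strip=True):
            return node.get_text(strip=True)
    return ""

# Two hypothetical layouts publishing the same field differently
site_a = BeautifulSoup('<h1 class="course-title">Intro to Pottery</h1>', "html.parser")
site_b = BeautifulSoup('<div id="courseName">Intro to Pottery</div>', "html.parser")

selectors = ["h1.course-title", "#courseName"]
title_a = first_text(site_a, selectors)
title_b = first_text(site_b, selectors)
```

The same pattern extends to description, instructor, and schedule fields, and the resulting dicts write straight to CSV or JSON for the nightly cron run.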
...configured websites, APIs and RSS feeds are queried automatically on the defined schedule. 2. Duplicate entries are detected and suppressed. 3. A single JSON response returns merged results in under one second for a 100-item request on standard hosting. 4. Setup instructions let me reproduce the build from scratch in less than 30 minutes. Feel free to choose the tech stack—Python (Scrapy, BeautifulSoup, FastAPI), Node.js (Cheerio, Axios, Express) or something comparably mainstream—as long as you specify versions and any open-source libraries you use. Please include a brief outline of your approach, the tools you prefer, and an estimated timeline....
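Requirement 2 (duplicate detection across sources) often comes down to canonicalising item links before merging; the normalisation rules below (lowercase host, drop query string and fragment, strip trailing slash) are one reasonable assumption, not the only possible policy:

```python
from urllib.parse import urlsplit, urlunsplit

def canonical(url):
    """Normalise a URL so trivially-different duplicates collapse to one key."""
    p = urlsplit(url)
    return urlunsplit((p.scheme.lower(), p.netloc.lower(), p.path.rstrip("/"), "", ""))

def merge_feeds(*feeds):
    """Merge item lists from several sources, suppressing duplicate links
    and preserving first-seen order for the JSON response."""
    seen, merged = set(), []
    for feed in feeds:
        for item in feed:
            key = canonical(item["link"])
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged

a = [{"link": "https://Example.com/news/1?utm=x", "title": "One"}]
b = [{"link": "https://example.com/news/1", "title": "One (copy)"},
     {"link": "https://example.com/news/2", "title": "Two"}]
merged = merge_feeds(a, b)
```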
I’m looking for a clean command-line script that reliably pulls product prices from Wefix and Save.co. What I need: • A single execu...structured with product name, current price, timestamp, and source URL. • Robust scraping logic that handles pagination, typical anti-bot measures, and polite rate limiting. No headless browser unless necessary. • Modular design so I can easily plug in extra sites later—Amazon, eBay, Walmart, or others—without rewriting core code. • Clear README with setup steps, dependency list (requests/BeautifulSoup, puppeteer, etc. as appropriate), and sample commands. • Basic logging plus error handling so failures don’t silently pass. Deliverables: the full source code, a short setup guide, and an...
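The modular "plug in extra sites later" design could take the shape of a parser registry, so adding Amazon or eBay is one new function rather than a core rewrite; the site key and the toy pipe-separated "HTML" below are placeholders, since the real selectors for Wefix and Save.co are unknown:

```python
SCRAPERS = {}

def register(domain):
    """Decorator: plug a per-site parser into the registry by domain key."""
    def wrap(fn):
        SCRAPERS[domain] = fn
        return fn
    return wrap

@register("wefix")
def parse_wefix(page_text):
    # Placeholder parsing for the sketch -- a real parser would use
    # BeautifulSoup selectors against the fetched product page.
    name, price = page_text.split("|")
    return {"name": name.strip(), "price": float(price), "source": "wefix"}

def scrape(domain, page_text):
    """Dispatch to whichever parser handles this site."""
    if domain not in SCRAPERS:
        raise ValueError(f"no parser registered for {domain}")
    return SCRAPERS[domain](page_text)

row = scrape("wefix", "Widget | 19.99")
```

Timestamp and source URL would be stamped onto each row by the shared core, keeping the per-site functions minimal.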
I...data points: • First name • Last name • Company web address • Direct email address • Personal LinkedIn URL I’m interested only in real owners or solo founders, not general “info@” or generic company contacts. Please structure the output in a spreadsheet (CSV or XLSX) with one row per person and separate columns for every required field. Use whatever stack you prefer—Python with BeautifulSoup, Scrapy, Selenium, or your own tooling—so long as it respects platform terms of service and returns accurate, current information. I’ll review a small sample first to confirm formatting and data quality; once approved, you can complete the full scrape. Let me know how many records you realistically expect to deli...
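The "no info@ or generic company contacts" rule is easy to enforce mechanically; the list of role-based prefixes below is an assumption that the client would likely want to extend:

```python
import re

GENERIC_PREFIXES = {"info", "sales", "support", "contact", "admin", "hello", "office"}

def looks_personal(email):
    """Reject role-based inboxes so only likely owner/founder addresses remain."""
    m = re.match(r"^([^@]+)@([^@]+)$", email.strip().lower())
    if not m:
        return False          # malformed address
    local = m.group(1)
    # Ignore plus-tags like sales+web@... when checking the prefix
    return local.split("+")[0] not in GENERIC_PREFIXES

leads = ["jane.doe@acme.com", "info@acme.com", "sales+web@beta.io", "j.smith@beta.io"]
personal = [e for e in leads if looks_personal(e)]
```

Rows that fail the check would be dropped before the CSV/XLSX sample is sent for review.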