
In Progress
Posted
Paid on delivery
I am looking for an experienced Apify specialist to execute a highly targeted, small-scale data extraction project. The goal is to collect 200–300 high-signal records from specific professional discussion forums and community platforms. This is not a complex web development project. I have already created an execution plan. Your job is to configure the tools in my workspace, execute the runs cleanly, and normalize the exported data.

Scope of Work:
- Task Setup: Configure specific Apify actors (primarily Web Scraper and Reddit scrapers) directly within my Apify workspace so I retain the assets.
- Execution: Run pre-defined search queries (which will be provided upon hire) across multiple platforms.
- Strict Filtering: Apply post-run filters to the dataset. For example, on certain platforms, you must filter the dataset to only include records where the author possesses specific verified credential flairs.
- Data Normalization: Clean the raw export before final delivery. This includes deduplicating records and mapping synonym terms to a canonical name (e.g., mapping various abbreviations to a single standard name).
- Delivery: Provide the final cleaned dataset in both JSON and CSV formats.

Target Sources:
- Reddit (specific subreddits, filtering for verified credentials)
- Doximity / OpMed
- Medscape
- HealthUnlocked
- Pharmacy Times

Requirements & Milestones:
To ensure quality and alignment, this project will be strictly managed via milestones. Do not bid if you cannot agree to the following workflow:
- Milestone 1: Delivery and verification of Tier 1 sources (~100 records).
- Milestone 2: Delivery and verification of Tier 2 sources (~100-150 records) and final data normalization.

To Apply:
Please start your proposal with the word "CANONICAL" so I know you read this entire posting. In your proposal, briefly confirm your experience with Apify and state how you handle data normalization (e.g., Excel, Python, Pandas) for the final CSV delivery.
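The filtering, normalization, and delivery steps described in this posting could look roughly like the sketch below. It uses only the Python standard library. The field names (`source`, `url`, `author_flair`, `term`), the flair strings, and the synonym map are all hypothetical placeholders, since the posting does not specify the export schema or the real values.

```python
import csv
import io
import json

# Hypothetical values: the real flair strings and synonym map would come
# from the client's execution plan, not from this sketch.
VERIFIED_FLAIRS = {"Verified MD", "Verified PharmD"}
CANONICAL_NAMES = {
    "apap": "acetaminophen",
    "paracetamol": "acetaminophen",
}


def normalize(records):
    """Filter by flair (Reddit only), map terms to canonical names, deduplicate."""
    seen = set()
    cleaned = []
    for rec in records:
        # Strict filtering: keep Reddit records only when the flair is verified.
        if rec.get("source") == "reddit" and rec.get("author_flair") not in VERIFIED_FLAIRS:
            continue
        # Synonym mapping: unknown terms pass through lowercased and trimmed.
        term = (rec.get("term") or "").strip().lower()
        canonical = CANONICAL_NAMES.get(term, term)
        # Deduplication: assumes (source, url) identifies a record uniquely.
        key = (rec.get("source"), rec.get("url"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**rec, "term": canonical})
    return cleaned


def to_json(records):
    """Serialize the cleaned records as a JSON array."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records):
    """Serialize the cleaned records as CSV; missing fields become empty cells."""
    if not records:
        return ""
    fieldnames = sorted({key for rec in records for key in rec})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

Deduplicating on `(source, url)` is only one possible choice. If the same post can appear under several URLs, a key built from the author and a hash of the text may be a better fit.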
Project ID: 40377490
46 proposals
Remote project
Active 1 mo ago
46 freelancers are bidding on average $139 CAD for this job

Youssef, Full-Time Python Developer with deep expertise in Apify and targeted data extraction. I will configure your specified Apify actors like the Web Scraper and Reddit scrapers directly in your workspace as required. My approach uses Apify for the primary runs, then I apply strict post-run filtering for credentials and normalize the data using Python and Pandas for deduplication and mapping terms to a canonical standard. I have successfully completed numerous projects involving precise data extraction and cleaning from similar forums. To proceed, could you share the pre-defined search queries for the first tier? Ready to start immediately.
$250 CAD in 1 day
7.3

CANONICAL. Hi there, I’ve carefully reviewed your project and I am confident I can execute this Apify-based, small-scale, high-signal data extraction exactly as specified, focusing on clean configuration, accurate filtering, and fully normalized structured output. My approach begins with setting up and configuring the required Apify actors directly inside your workspace (primarily Web Scraper and Reddit scrapers), ensuring all runs are properly parameterized using your predefined search queries. This keeps the setup fully transparent and reusable within your environment. Next, I will execute controlled runs across each target platform, applying strict post-processing filters to ensure only qualified records are retained (including credential-based filtering where applicable, such as verified flairs on Reddit or equivalent indicators on other forums). This step prioritizes signal quality over raw volume. After extraction, I will perform full data normalization using Python (Pandas) to clean, deduplicate, and standardize fields. Finally, I will deliver the dataset in both CSV and JSON formats, aligned with your milestone structure (Tier 1 first, then Tier 2 with final normalization). What level of strictness do you want applied when a record is partially qualified: should I exclude it entirely or retain it with a “low-confidence” flag for review? I’m ready to start immediately and execute Milestone 1 first. Warm regards, Aneesa.
$150 CAD in 1 day
6.8

Hello, I have over 7 years of experience in Data Processing, Data Collection, and Data Mining. I have carefully read your project requirements and am confident in my ability to execute the tasks outlined. For this project, I will configure the specific Apify actors in your workspace, run the pre-defined search queries across multiple platforms, apply strict filtering criteria, and normalize the exported data. I will ensure that the final cleaned dataset is delivered in both JSON and CSV formats, meeting your requirements. I am well-versed in using Apify for data extraction and have expertise in data normalization using tools like Python and Pandas for CSV delivery. Let's discuss further details in the chat to align on the project specifics and milestones. You can visit my Profile: https://www.freelancer.com/u/HiraMahmood4072 Thank you.
$100 CAD in 2 days
6.4

Hello there, CANONICAL — I will configure the Web Scraper and Reddit actors in your Apify workspace, execute your predefined queries, and deliver filtered, normalized datasets in JSON and CSV. For normalization, I will use Python with Pandas — building a synonym mapping dictionary so abbreviations and variant terms resolve to one canonical name before deduplication. This makes the pipeline repeatable if you ever expand sources. Questions: 1) Do the non-Reddit sources (Doximity, Medscape) require authenticated sessions, or is public content sufficient? 2) For the credential flair filtering on Reddit — do you have the exact flair strings, or should I extract all unique flairs first for your review? Ready to start whenever you are. Kamran
$125 CAD in 5 days
6.0
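The flair question raised in the bid above (extracting all unique flairs first for review) could be handled with a short tally like the one below. This is a sketch only: `author_flair` is an assumed field name, and the actual key depends on the output schema of whichever Reddit actor is used.

```python
from collections import Counter


def flair_counts(records, flair_field="author_flair"):
    """Return (flair, count) pairs, most frequent first, for manual review.

    `flair_field` is a hypothetical key; missing or blank flairs are grouped
    under "(no flair)" so they stay visible in the review list.
    """
    counts = Counter(
        (rec.get(flair_field) or "").strip() or "(no flair)"
        for rec in records
    )
    return counts.most_common()
```

Reviewing this list before filtering lets the client confirm the exact flair strings to allow, rather than guessing at spelling or capitalization variants.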

Hello, After reviewing your project description, I am confident and excited to work on this project for you. I have some crucial points and questions to clarify. Please leave a message in the chat to discuss this, and I can share my recent work that is similar to your requirements. I am excited to hear from you soon. Thank you!
$140 CAD in 7 days
6.0

CANONICAL Hello, I have hands-on experience working with Apify for targeted data extraction, including configuring actors like Web Scraper and Reddit scrapers directly within client workspaces to ensure full ownership and control. I’ve executed similar small-scale, high-signal scraping projects where precision mattered more than volume—applying strict filters (e.g., verified flairs, role-based identifiers) and ensuring clean, relevant datasets. I’m comfortable running structured queries across multiple platforms and handling post-run filtering within Apify as well as externally when needed. For data normalization, I typically use Python (Pandas) for deduplication, schema standardization, and mapping synonyms to canonical values. For simpler cases or client preference, I also use Excel/Google Sheets with structured validation and cleanup workflows. I will ensure your final dataset is clean, deduplicated, and delivered in both JSON and CSV formats, fully aligned with your milestone requirements. I’m comfortable working within your defined workflow and can deliver accurately for both milestones with clear validation at each step. Warm regards, Harpreet Singh
$75 CAD in 5 days
5.5

CANONICAL As an experienced software engineer with a deep understanding of data management and processing, I believe I could be the ideal fit for your Apify project. By setting up specific Apify actors in your workspace and running pre-defined search queries across multiple platforms, I can work within the scope of this project and deliver high-signal records from professional discussion forums and community platforms. My expertise in JSON and CSV normalization will ensure a clean export before final delivery, streamlining the analysis process for you. Data filtering will not be an issue, as I've completed numerous projects that focused on applying strict filters to extracted datasets. This is why I'm confident in my ability to filter your dataset to only include records that meet the verified credential criteria on different platforms. Given my blend of skills, including APIs, data analysis, management, processing, and web scraping, I have a thorough understanding of the Apify landscape as well as competence in working with multiple data sources. Trust me to complete this project efficiently without compromising on quality or accuracy, because that’s who I am and what I do. Let’s bring your vision of seamless targeted data extraction into reality! Can't wait to work with you!
$130 CAD in 5 days
5.3

I can configure and run Apify actors in your workspace, execute targeted queries, apply strict filtering, and deliver clean, deduplicated datasets in JSON/CSV. I use Python + Pandas for normalization (mapping synonyms, removing duplicates, structuring fields) to ensure high-quality output. Ready to handle both milestones with accurate results.
$140 CAD in 1 day
5.3

Hello there, we are a team of developers and we can do this project in no time. Please, send me a message to discuss the work. Thanks Ashish Kumar.
$140 CAD in 7 days
4.7

Hi, I have strong experience in Apify, web scraping, data extraction, JSON/CSV processing, and Python (Pandas) for data normalization. For this project, I can configure and run Apify actors directly in your workspace, execute your predefined queries across sources like Reddit and medical platforms, apply strict filtering (including credential-based selection), and deliver a clean, deduplicated dataset with normalized fields using Python and Pandas for consistent output. I have real hands-on experience handling structured scraping pipelines and post-processing workflows, so I can ensure clean milestone-based delivery with accurate and reliable data. You can expect clear communication, fast turnaround, and a high-quality result. Best regards, Juan
$140 CAD in 1 day
4.9

CANONICAL Hi there, I understand you need an experienced Apify specialist to execute a precise, 200–300 record extraction project across specialized forums like Reddit, Doximity, and Medscape. I have extensive experience configuring Apify Actors and am well-versed in setting up assets directly within a client's workspace to ensure you retain full control and ownership of the tools. My approach will follow your execution plan strictly, starting with the configuration of targeted search queries and the application of rigorous flair-based filtering to ensure only high-signal records from verified authors are captured. For Data Normalization, I primarily utilize Python and Pandas to handle the mapping of synonyms to canonical names and ensure perfect deduplication before final delivery. I am fully committed to your milestone-based workflow, delivering verified Tier 1 and Tier 2 records sequentially to ensure every entry meets your strict quality standards. Deliverables: Fully configured Apify Actors in your workspace, verified Tier 1 and Tier 2 source records delivered via milestones, and a final normalized dataset in JSON and CSV formats including canonical name mapping and deduplication. QUESTION: Are there specific "high-signal" keywords within the post content itself that should trigger an inclusion or exclusion during the filtering phase? Let’s chat and get started now! Regards, Shehwani.
$100 CAD in 1 day
4.6

As someone who has spent significant time immersed in web development and automation, I can state confidently that I am the perfect fit for your targeted data extraction project. I have extensive experience using Apify tools like Web Scraper and Reddit scrapers, the skills you're specifically looking for on this project. Your "Execution" and "Strict Filtering" tasks align well with my automation-focused skill set, allowing me to efficiently and accurately curate records from these professional platforms. Moreover, my familiarity with data normalization, whether through Excel, Python, or Pandas, will ensure your raw exports are thoroughly cleaned and formatted before final delivery. In choosing me for this project, you're benefiting from a freelancer who not only understands Apify but also has skills across multiple technologies like PHP, React.js, JavaScript, and Python, all of which can come in handy at different phases of the project. This breadth of knowledge helps me deliver not just technically sound work but solutions that are flexible and scalable. So if you want an efficient executor of your project's roadmap, choose Anton for reliable results!
$140 CAD in 7 days
4.1

⚠️ If you're not happy, you don’t pay. ⚠️

Hi there, Thank you for checking my proposal and sharing the detailed project brief. I can build a precise and efficient data extraction system using Apify with a focus on accuracy and data normalization.

I will deliver:
• Set up Apify actors for Web Scraper and Reddit scrapers
• Run search queries across multiple platforms
• Apply strict post-run filtering
• Normalize and clean the final dataset, including deduplication and term mapping

You will also receive:
• Detailed documentation on the workflow
• Training on utilizing the extracted data effectively

I am confident I can execute your vision professionally and efficiently. Looking forward to discussing timeline and next steps. Best regards, Chirag.
$200 CAD in 7 days
3.8

CANONICAL I’ve worked with Apify extensively for targeted data extraction, including configuring actors (Web Scraper, Reddit, custom APIs) directly inside client workspaces to ensure full ownership and reproducibility. I can follow your execution plan precisely: setting up actors, running structured queries, and applying strict post-run filters (e.g., verified flair-based filtering on Reddit). I understand the importance of high-signal datasets and will ensure only relevant records are retained. For data normalization, I primarily use Python (Pandas) along with Excel for validation. My process includes deduplication, schema standardization, and mapping synonyms/abbreviations into canonical formats for clean analysis-ready output.

I’m comfortable working milestone-based and will deliver:
• Tier 1 (~100 records) clean + verified
• Tier 2 (remaining records) + full normalization
• Final dataset in JSON & CSV

You’ll get well-structured, accurate, and ready-to-use data.
$100 CAD in 7 days
3.9

Hi there, CANONICAL. It looks like you need help with a targeted data extraction project using Apify, focusing on gathering 200-300 records from specific professional forums. With 4+ years of experience in web scraping and data processing, I can efficiently set up the required Apify actors and execute your plan while ensuring the data is clean and well-structured. I’m comfortable normalizing data using tools like Python and Pandas, which helps in deduplicating records and mapping various terms to their standard names. I understand the importance of filtering for verified credentials, especially on platforms like Reddit and Doximity, to ensure you get high-quality data. One question I have is: how do you plan to validate the records after each milestone to ensure they meet your expectations? Best regards, Arslan Shahid
$30 CAD in 3 days
3.7

Dear Client, I’m a full-stack developer with 10+ years of experience, including hands-on work with Apify actors, web scraping workflows, and structured data pipelines using Python and automation tools. I understand you need precise execution within your Apify workspace—configuring actors, running targeted queries, applying strict filters (e.g., verified flairs), and delivering 200–300 high-quality, normalized records. I’ve handled similar projects extracting niche datasets, ensuring accuracy through controlled runs and validation checkpoints aligned with milestone-based delivery. I use Python (Pandas) and Excel for deduplication, schema mapping, and canonical normalization before exporting clean JSON/CSV datasets. Looking forward to hearing from you. Best regards, Md Ruhul Ajom
$80 CAD in 3 days
4.4

Hello, I've worked extensively with Apify actors including Web Scraper and Reddit scrapers for targeted data extraction projects involving professional forums and community platforms. I've configured actor setups directly in client workspaces, executed search query batches across multiple sources, and built post-run filtering pipelines to isolate high-signal records based on engagement metrics and keyword relevance. For your 200-300 record scope, I'd propose setting up normalized output schemas upfront so the exported data aligns cleanly with your downstream needs. Let's discuss!
$150 CAD in 3 days
3.2

CANONICAL Hello. The biggest headache with targeted data extraction is usually ensuring the data is clean and truly "high-signal" without manual re-verification. I solve this by setting up precise filtering rules directly within the scraping process and then rigorously cleaning post-extraction. I have strong experience with APIFY actors, including Web Scraper and Reddit scrapers. I'll configure these directly in your workspace, execute the predefined queries, and apply the specific post-run filters you've outlined for verified credential flairs. For data normalization, I use PYTHON and PANDAS to handle deduplication and mapping synonym terms to canonical names, so your final JSON and CSV outputs are perfectly clean and ready. I can complete this project, including setting up both milestones, for $180 CAD within 1-3 days. Message me if you'd like to quickly review the structure of the initial search queries. Best regards, Yevhen.
$180 CAD in 1 day
3.0

Hi, I will configure and execute the Apify tools in your workspace to extract and normalize the targeted data from the specified forums and platforms. With extensive experience using Apify, I have successfully handled similar data extraction projects and can efficiently set up the Web Scraper and Reddit scrapers to meet your requirements. For data normalization, I typically use Python and Pandas to ensure the dataset is clean, deduplicated, and standardized. This approach guarantees that records align with your specified credential flairs and that synonym terms are mapped accurately. I am confident in executing the runs according to your provided search queries and applying strict filtering criteria for quality assurance. I am ready to start immediately, and I agree with your milestone structure for managing deliverables. Let’s ensure we meet your goals efficiently and effectively. Thank you.
$156.50 CAD in 7 days
2.8

Targeted extraction usually takes a couple of hours once I know the site, so the details here matter more than the scope. I run production scrapers for my own projects and client work, using Playwright, Apify actors, and custom Python depending on what the target needs. For Apify specifically, I know which store actors (Web Scraper, Cheerio Scraper, Puppeteer Scraper) suit which site types, and when it's worth writing a custom actor vs configuring an existing one. On small jobs I keep an eye on actor unit consumption so you don't burn through your plan on a 500-row extract.

What I'd deliver:
- Target site scoped, extraction logic built and tested
- Output in CSV, JSON, or pushed to Google Sheets/webhook
- Clean field mapping, no junk columns
- Notes on re-running if you need it again later

3 days, 220 CAD. Realistically faster once I see the site.

Before I scope this properly: what site(s) are you extracting from and which fields do you need? One-off pull or something you'd want on a recurring schedule? And what format works best for wherever the data is going?
$220 CAD in 3 days
3.0

Longueuil, Canada
Payment method verified
Member since Mar 12, 2025
$30-250 CAD