
Set up PDF Parser with Python and OCR

$30-250 USD

In Progress
Posted about 6 years ago

Paid on delivery
Description
1) Develop a web page on a CoreUI front end with a PDF file upload, a dropdown menu listing 3 forms (IRS Form 1120, IRS Form 1120S, IRS Form 1165), and an Upload button.
2) After the file uploads, set up basic pre-processing (clean-up) and parsing (OCR) of page 1 of IRS Form 1120, including:
a) Clean-up of the image, such as:
i) Turn to black & white / crop / extract lines (tools: OpenCV, Numpy)
ii) Set up coordinate-defined “Sub-Image Areas” (e.g. a Sub-Image Area for the “output box” for 1a. Gross Receipts and Sales; see the Word file for the full description and pictures)
iii) Parse each Sub-Image Area with Tesseract OCR
3) Set up a page with a PDF viewer (e.g. [login to view URL]) and an “Editable Output” field (the value from 1a. Gross Receipts and Sales) so that the user can confirm the parsing.
4) Add a Save button that saves the original file, the cleaned-up file, the parsed text, and the user-verified text to the database.
Tesseract OCR documentation: [login to view URL]
CoreUI framework documentation: [login to view URL]
Other tools:
● OpenCV: [login to view URL]
● Numpy: [login to view URL]
● PDF Miner: [login to view URL]
● Image Magick: [login to view URL] (constraint-based algorithm to figure out where the line is)
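A minimal sketch of the clean-up and Sub-Image Area steps described above, in Python. Everything named here is an assumption for illustration: the `SubImageArea` class, the helper functions, the threshold values, and especially the coordinates for box 1a, which would have to be measured on the actual form. To keep the sketch light it uses plain NumPy for thresholding and line detection; in practice OpenCV calls such as `cv2.threshold` would likely fill that role. The OCR step assumes the `pytesseract` wrapper and a Tesseract install, and is imported lazily so the rest runs without them.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class SubImageArea:
    """A named box on the form, as fractions of page width and height."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


# Placeholder coordinates (hypothetical): the real position of the
# "1a. Gross Receipts and Sales" output box must be measured on the form.
GROSS_RECEIPTS_1A = SubImageArea("1a_gross_receipts", 0.55, 0.20, 0.75, 0.23)


def to_black_and_white(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Binarize a grayscale page: 0 for ink, 255 for paper."""
    return np.where(gray < threshold, 0, 255).astype(np.uint8)


def find_horizontal_lines(binary: np.ndarray, min_fraction: float = 0.8) -> list[int]:
    """Return the row indices where at least min_fraction of the pixels are ink."""
    ink_per_row = (binary == 0).mean(axis=1)
    return [int(row) for row in np.flatnonzero(ink_per_row >= min_fraction)]


def crop_area(page: np.ndarray, area: SubImageArea) -> np.ndarray:
    """Cut one Sub-Image Area out of the page using its fractional coordinates."""
    height, width = page.shape[:2]
    y0, y1 = round(area.top * height), round(area.bottom * height)
    x0, x1 = round(area.left * width), round(area.right * width)
    return page[y0:y1, x0:x1]


def ocr_area(sub_image: np.ndarray) -> str:
    """Run Tesseract on one cropped box (needs pytesseract and the tesseract binary)."""
    import pytesseract  # lazy import: only needed for the OCR step

    # --psm 7 asks Tesseract to treat the image as a single line of text.
    return pytesseract.image_to_string(sub_image, config="--psm 7").strip()
```

A typical flow under these assumptions would be `binary = to_black_and_white(gray_page)`, then `ocr_area(crop_area(binary, GROSS_RECEIPTS_1A))`, with the result shown in the editable field for the user to confirm. Storing the boxes as page fractions rather than pixels keeps them valid when the PDF is rasterized at a different resolution.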
Project ID: 16622809

About the project

4 proposals
Remote project
Active 6 yrs ago

4 freelancers are bidding on average $422 USD for this job
Dear Sir, how are you? I am very interested in your project and am ready to start on it right away. I will work hard and do my best for you. Best regards
$155 USD in 3 days
5.0 (135 reviews)
8.5
A proposal has not yet been provided
$155 USD in 5 days
4.9 (67 reviews)
7.1
We have already completed many OCR projects with Tesseract, and we have extensive experience parsing PDF pathology reports. We are therefore very confident about this project. About us: we are a team of data scientists and developers with over 10 years of development experience in web, mobile, and desktop applications in Java, Python, C++, and JavaScript. Please contact us over chat for further discussion.
$555 USD in 10 days
5.0 (53 reviews)
6.6
https://www.freelancer.com/projects/php/License-Plate-Detection-With-Chinese/ https://www.freelancer.com/projects/Python/Qualitative-Comparative-Study-Face-13933384/ We are a team of developers who have worked with Python, OpenCV, NumPy, SciPy, and machine learning. Let's discuss it over chat.
$823 USD in 3 days
4.9 (42 reviews)
6.5

About the client

New York, United States
5.0
6
Payment method verified
Member since Jan 27, 2018

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759) & Freelancer Online India Private Limited (CIN U93000HR2011FTC043854)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)