AI/NLP engineer with strong healthcare document processing experience.
We are looking for an experienced AI/NLP engineer with strong healthcare document processing experience.
Our project involves recognition, parsing, and structured analysis of medical documents, especially SNF / long-term care documents and MDS assessments. The goal is to extract clinical and administrative information from scanned PDFs, faxed documents, and digital medical records, then convert it into validated structured JSON for our healthcare platform.
Required experience:
* OCR and document AI for scanned PDFs and medical records
* LLM-based information extraction
* Clinical NLP / healthcare NLP
* Medical document classification
* Table and form extraction
* JSON schema extraction
* Confidence scoring and source references
* HIPAA-aware architecture
* Python backend experience
* Experience with tools such as AWS Textract, Google Document AI, Azure AI Document Intelligence, LlamaParse, Tesseract, John Snow Labs, LangChain, or LlamaIndex
Nice to have:
* Experience with SNF, long-term care, MDS 3.0, PDPM, EHR documents, or healthcare claims
* Experience building human-in-the-loop review workflows
* Experience with de-identification of PHI
First task:
We will provide a small set of de-identified sample documents. The candidate should build a proof of concept that:
1. Recognizes the document type
2. Extracts key fields into JSON
3. Provides confidence scores
4. Shows source page/text references
5. Marks missing or uncertain fields instead of guessing
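The five requirements above could map onto an output record along these lines. This is only a minimal sketch: the field names, the `UNCERTAIN_BELOW` threshold, the `status` values, and the sample values are all assumptions made for illustration, not a schema the client has specified.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

# Hypothetical cut-off; a real value would be tuned on reviewed samples.
UNCERTAIN_BELOW = 0.80


@dataclass
class ExtractedField:
    name: str
    value: Optional[str]          # None when nothing was found
    confidence: float             # 0.0 to 1.0
    source_page: Optional[int]    # 1-based page number of the evidence
    source_text: Optional[str]    # snippet the value was read from
    status: str = "ok"            # "ok", "uncertain", or "missing"


def finalize(field: ExtractedField) -> ExtractedField:
    """Mark a field as missing or uncertain instead of passing a guess through."""
    if field.value is None or field.value.strip() == "":
        return ExtractedField(field.name, None, 0.0,
                              field.source_page, field.source_text, "missing")
    if field.confidence < UNCERTAIN_BELOW:
        return ExtractedField(field.name, field.value, field.confidence,
                              field.source_page, field.source_text, "uncertain")
    return field


def build_result(doc_type: str, doc_type_confidence: float,
                 fields: List[ExtractedField]) -> dict:
    """Assemble the JSON-serializable record for one document."""
    return {
        "document_type": doc_type,
        "document_type_confidence": doc_type_confidence,
        "fields": [asdict(finalize(f)) for f in fields],
    }


# Made-up sample values, for illustration only.
sample = build_result("mds_assessment", 0.93, [
    ExtractedField("assessment_reference_date", "2024-01-15", 0.97, 1,
                   "Assessment Reference Date: 01/15/2024"),
    ExtractedField("primary_diagnosis", "CHF", 0.55, 3, "Dx: CHF?"),
    ExtractedField("payer", None, 0.40, None, None),
])
print(json.dumps(sample, indent=2))
```

Keeping `status` separate from `confidence` lets a reviewer filter for fields that need attention without having to know the threshold.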
Please send examples of similar healthcare document AI / OCR / NLP projects you have built.
-
Hello! Are you already using ready-made models for classifying types of medical documents, or do you plan to train your own from scratch?
I will discuss the deadlines and budget more precisely in personal correspondence.
Here’s how I will execute this project:
1. I will deploy an OCR pipeline (AWS Textract or LlamaParse) to extract text from PDFs and faxes.
2. I will apply an LLM (for example, orchestrated through LangChain) to parse clinical fields into structured JSON with confidence scoring.
3. I will add validation with source references and automatic tagging of missing fields.
… Thank you for considering my proposal. I look forward to the opportunity to collaborate with you!
-
we already have an almost ready healthcare document ai pipeline that can be adapted quickly for your poc, and i am online here to discuss the sample set now (:
for the first task, i estimate 10 days and 2500 usd for a controlled poc - document type recognition, ocr, field extraction to json, confidence scores, source page references, and missing-field handling without guessing.
similar healthcare and ai work
- https://business.ingello.com/rapport - healthcare process automation and structured clinical workflow logic
- https://business.ingello.com/lita-doctor - medical platform experience with doctor-side workflows and structured records
- https://business.ingello.com/vorfahr - ai automation case, relevant for extraction pipelines and agent-based processing
… AI extraction should be built as separate layers - ocr, document classification, schema extraction, validation, confidence scoring, and review of uncertain fields.
i would use python backend, aws textract or google document ai where useful, and llm extraction with strict schemas, source anchoring, and no free guessing.
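As a rough illustration of the "separate layers" idea, each stage can be a small function that takes and returns a document record, so any layer can be swapped out on its own. Everything in this sketch is a stand-in assumed for the example: the layer names, the keyword classifier, and the `Label: value` line parser take the place of a real OCR service and schema-constrained LLM extraction.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]
Layer = Callable[[Doc], Doc]


def ocr_layer(doc: Doc) -> Doc:
    # Stand-in: a real layer would call an OCR service and return page texts.
    doc["pages"] = doc.get("pages") or [doc.get("raw_text", "")]
    return doc


def classify_layer(doc: Doc) -> Doc:
    # Stand-in keyword rule; a real classifier would be a model.
    text = " ".join(doc["pages"]).lower()
    doc["document_type"] = "mds_assessment" if "mds" in text else "unknown"
    return doc


def extract_layer(doc: Doc) -> Doc:
    # Stand-in for schema extraction: read "Label: value" lines and keep
    # the page number and source line as the anchor for each field.
    fields: List[Dict[str, Any]] = []
    for page_no, page in enumerate(doc["pages"], start=1):
        for line in page.splitlines():
            if ":" in line:
                label, value = line.split(":", 1)
                fields.append({
                    "name": label.strip().lower().replace(" ", "_"),
                    "value": value.strip() or None,  # empty stays None, no guessing
                    "page": page_no,
                    "source_text": line.strip(),
                })
    doc["fields"] = fields
    return doc


def validate_layer(doc: Doc) -> Doc:
    # Route anything unresolved to human review.
    doc["needs_review"] = [f["name"] for f in doc["fields"] if f["value"] is None]
    if doc["document_type"] == "unknown":
        doc["needs_review"].append("document_type")
    return doc


def run_pipeline(doc: Doc, layers: List[Layer]) -> Doc:
    for layer in layers:
        doc = layer(doc)
    return doc


LAYERS: List[Layer] = [ocr_layer, classify_layer, extract_layer, validate_layer]

# Made-up sample text, for illustration only.
result = run_pipeline(
    {"raw_text": "MDS 3.0\nResident Name: \nAssessment Type: Quarterly"}, LAYERS
)
print(result["document_type"], result["needs_review"])
```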
for hipaa-aware handling, i would keep storage, access control, audit logs, and de-identification separated before model processing where required.
two quick questions before i lock the estimate more tightly
- how many sample document types are in the poc set - mds, faxed snf notes, claims, care plans, or something else
- do you already have target json schemas, or should we define them from the documents
our flh page - https://systems-fl.ingello.com
i can start with poc architecture and a first extraction prototype after receiving the de-identified samples... small note, clinical document ai usually looks smaller on paper than it becomes in production =/
-
We have experience in processing medical documents and implementing NLP solutions for extracting structured data from complex reports, including MDS and long-term care forms. We achieve this through custom OCR pipelines and LLM models tailored to the specifics of medical terminology to ensure high parsing accuracy. We are ready to discuss the details of integration into your system.
-
Good day, I am ready to complete your task quickly and efficiently. I have extensive experience in creating various parsers. Please write to me in private messages to discuss the details. I would be happy to help :)