LangChain + TypeScript + OpenAI
1. Introduction:
The project consists of writing new code, built on LangChain and based on the existing TypeScript script (scripts/ingest-web.ts). The new code should build a store of embeddings for the text sections and keep that store up to date when the script is run again.
2. Requirements Overview:
The software must have the following functions and capabilities:
- Iterate over all text sections available at https://www.uscis.gov/policy-manual/table-of-contents.
- Create embeddings from the text of each section and record them in a vector database.
- Check the number of tokens in the text. If it exceeds 15,000, the text must be split into parts of no more than 15,000 tokens each, and each part embedded separately.
- Compute a hash of the text, or use another suitable comparison method, to detect changes when the script is run again.
- If the text has changed, its embeddings must be regenerated.
3. Architecture and Components:
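The token limit and change-detection requirements from section 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: `approxTokens` is a crude stand-in (about 4 characters per token) for a real tokenizer, and `splitByTokenLimit`, `hashText`, `needsReembedding`, and the `storedHashes` map are hypothetical names, not part of the existing scripts/ingest-web.ts or of LangChain.

```typescript
import { createHash } from "node:crypto";

// Hypothetical type: any function that returns the token count of a string.
type TokenCounter = (text: string) => number;

// Assumption: a crude stand-in of about 4 characters per token.
// A real tokenizer for the chosen embedding model should replace it.
const approxTokens: TokenCounter = (text) => Math.ceil(text.length / 4);

const MAX_TOKENS = 15_000; // limit taken from the requirements

// Greedily packs whitespace-separated words into parts whose token count
// stays within maxTokens. Whitespace is normalized to single spaces, and a
// single word longer than the limit would still become its own oversized part.
function splitByTokenLimit(
  text: string,
  maxTokens: number = MAX_TOKENS,
  countTokens: TokenCounter = approxTokens,
): string[] {
  const parts: string[] = [];
  let current = "";
  for (const word of text.split(/\s+/).filter(Boolean)) {
    const candidate = current ? `${current} ${word}` : word;
    if (current && countTokens(candidate) > maxTokens) {
      parts.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) parts.push(current);
  return parts;
}

// SHA-256 hex digest of the section text, used as its change fingerprint.
function hashText(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// True when the section is new or its text differs from the stored hash.
// `storedHashes` (section URL -> hash) is a hypothetical stand-in for
// whatever metadata the vector database keeps.
function needsReembedding(
  url: string,
  text: string,
  storedHashes: Map<string, string>,
): boolean {
  return storedHashes.get(url) !== hashText(text);
}
```

On a re-run, a section would be re-embedded only when `needsReembedding` returns true, and its text would first pass through `splitByTokenLimit` so that no part exceeds the limit. Calling the counter on every candidate is simple but could be slow with a real tokenizer, so a production version would probably count incrementally.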
The software must have the following structure and basic components:
- Main component: the ingest-web.ts file, modified to implement the new requirements.
- Database component: a vector database in which the embeddings will be recorded.
4. Interface:
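For the website side of the interface, a minimal sketch of collecting section links from the table-of-contents HTML might look like this. It assumes, without having verified the page markup, that section pages are ordinary links on the same origin under the `/policy-manual/` path. `extractSectionUrls` and `SECTION_PATH_PREFIX` are hypothetical names, and the regular expression is a rough shortcut where a proper HTML parser would be more robust.

```typescript
// Assumption: section pages live on the same origin under this path prefix.
const SECTION_PATH_PREFIX = "/policy-manual/";

// Returns the unique absolute URLs of section pages linked from the given HTML.
function extractSectionUrls(html: string, tocUrl: string): string[] {
  const base = new URL(tocUrl);
  // Rough pattern for the href attribute of <a> tags.
  const hrefPattern = /<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi;
  const urls = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    let url: URL;
    try {
      url = new URL(match[1], base); // resolves relative links against the TOC URL
    } catch {
      continue; // skip malformed hrefs
    }
    url.hash = ""; // links that differ only by fragment point to the same page
    const sameSite = url.origin === base.origin;
    const isSection = url.pathname.startsWith(SECTION_PATH_PREFIX);
    // Skip links back to the table of contents itself.
    if (sameSite && isSection && url.pathname !== base.pathname) {
      urls.add(url.href);
    }
  }
  return Array.from(urls);
}
```

The HTML itself would come from an HTTP request to the table-of-contents URL, and each returned URL would then be fetched in turn to obtain the section text.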
The software must interact with external systems and third-party programs as follows:
- Interact with the website at https://www.uscis.gov/policy-manual/table-of-contents to get the text of the sections.
- Integrate with the database to record the embeddings.
5. Security:
The software must meet the following security requirements and data protection mechanisms:
- Protection of data stored in the vector database from unauthorized access.
- Protection against possible vulnerabilities and attacks, such as injection or buffer overflow.
6. Testing:
The software must be tested using the following test plan:
- Automated testing to verify that embeddings are created and recorded in the database.
- Performance testing to assess the time required to create embeddings and record them in the database.
7. Risk and Project Management:
The following risks are possible, along with ways to manage them:
- Risk: The structure of the website at https://www.uscis.gov/policy-manual/table-of-contents changes.
Management: Regularly monitor changes in the page structure and update the code as needed.
- Risk: A security breach of the vector database.
Management: Use data protection mechanisms such as encryption and access authorization.
8. Resources and Schedule:
For the implementation of the project, the following resources are available:
- A team that includes developers and testers.
Please let me know if you have any questions or need additional information.
-
Arsen Gutsal SOFTSKY
Which database exactly should be used?
A team of developers and testers for 5,000 UAH. Are you serious?