The following requirements may be excessive. After reading the plan and your vision for this project, we are open to suggestions, proposals, and other possibilities for creating this project.
General requirements
1. Knowledge of Python and methods for scraping websites
2. Basic knowledge of HTML and CSS
3. Knowledge or willingness to learn libraries such as: requests, sqlite, beautifulsoup4, selenium
4. Basic knowledge of version control system git
5. Basic knowledge of cloud technologies
Specific requirements:
1. Preparation of scrapers in Python (bots reading news pages) for the listed portals using the BeautifulSoup4 library or similar
2. Saving the downloaded content in JSON format (or in an SQLite database)
3. Automating the submission of the aforementioned content to ChatGPT via API along with the appropriate prompt. The query should be parameterized so that the model returns only the modified content without additional output.
4. Saving the generated content in JSON format or in an SQLite database
5. Automating the submission of content via API to a WordPress-based website (with the post status as draft)
6. Automating the program's operation via cron/Windows Tasks Scheduler/launchd or another scheduler
7. Optionally: containerization of the program in Docker
8. In the future: deploying the application in a cloud service (Heroku, AWS, GCP, or another)
Plan
This is solely a vision of operation; we are open to changes, proposals, and suggestions. It starts with a Python script that scrapes news from websites. This is not technically difficult but can be tedious because you have to dig into the code of those pages; sometimes something changes on the page (e.g., the name of a section or a class that holds that content in the page's code). So besides writing these scrapers, they will need to be maintained later - support from time to time in case of problems.
Once the scraper retrieves the content, it should ideally save it in a file or some simple database (SQLite) in a standard format - JSON is fine because it will be easier to send the data to Chat or some other AI. Data is sent via API, and I suspect that JSON will be the most convenient. From what I've seen, there's nothing difficult here; you just need to define in the request settings sent to Chat that it shouldn't add its own introductions but just spit out the summary of the content. It will also likely do this in JSON format, and it would be good to save that response in some file as well.
The last step is to send this to WordPress via API. I saw that you can set the post status (e.g., draft), which I think is a good option because before publication, you can review it to ensure there are no mistakes. But once it runs smoothly, you can change the status to publish automatically.
Regarding automation, at first, I would suggest running it manually once a day or scheduling it on your laptop - both Windows and macOS have such schedulers, so you can run it once a day. I think it will be worth uploading it to the cloud after verifying how it runs to avoid incurring costs too early. From quick research, Heroku might be a good option because they have many conveniences for such simple programs.
-
28 days2171 USD28 days2171 USD
Good morning. I have all the necessary skills mentioned in the technical specification. I guarantee the quality of the order execution. I will be pleased to cooperate.
-
7 days2171 USD
350 12 1 1 7 days2171 USDHello! I am a Python developer with 5 years of experience. I have experience with libraries such as requests, BeautifulSoup4, and Selenium, as well as in configuring automation through cron and Docker. AWS, ChatGPT, Langchain. The price is about $30 per hour, and the complexity of the task depends on the difficulty of the website. In addition, my girlfriend is a Polish language teacher with a C1 level and has recently started learning Python. She will gladly help with this project, which means you will gain an experienced programmer and excellent understanding of the Polish language at the same time.
I am waiting for the technical task and hope for a long-term collaboration!
-
7 days2171 USD
603 4 0 7 days2171 USDGreat, I just have experience with Python scrapers (one is in my portfolio). Django, Selenium, Beautiful Soup, Postgres, Mongo DB are my technologies, I also have experience in bypassing speed limits on websites. Ready to work. We can discuss the details about the terms and price, I assure further project support, I am open to a longer collaboration. I do not speak Polish, but I read it, I have English at a B1 level.
-
7 days2171 USD
1296 26 1 1 7 days2171 USDGood morning.
I have a question, is the use of Python necessary for this project?
I am implementing scrapers using Node.js + Puppeteer. If that would be OK, I can offer you to implement it in this stack.
I have quite a lot of experience in scraping many different sites. My overall experience in web programming is about 8 years.
Besides just the scraper, I can also propose to create a small web application for managing the scraper and, for example, reviewing results, logs, etc. The web app can be made in React.js or Vue.js and run together with the scraper. It will work equally well on Linux, Windows, MacOS.
So, I invite you to discuss the details and further cooperation.
-
7 days2171 USD
263 7 days2171 USDGood day Jakub S, unfortunately, I do not speak Polish, but I possess all the skills you need. I have experience in automation and parsing, as well as programming in Python, and I am also familiar with the technology for creating projects on WordPress.
-
14 days2171 USD
580 14 2 14 days2171 USDGood morning, it would be nice if you could specify which websites need to be scraped, as this is not always a trivial matter. BeautifulSoup or Scrapy do not always work because there are Cloudflare blocks, which complicates things. Plus, we need to plan the architecture so that the parsing doesn't crash at the first possible opportunity. I invite you to get in touch if you haven't chosen a contractor yet. We will discuss the technical details and deadlines, as the budget is approximately known.
-
1 day2170 USD
1970 25 1 1 day2170 USDHello, unfortunately I do not speak Polish, but I have a colleague who can help with this. I am familiar with all the technologies you need, so I believe it will not be difficult to handle this task. I have extensive experience in parsing various resources, from very simple to complex. Write to me, we will discuss everything, I think we can also lower the price.
-
7 days2171 USD
1993 12 0 7 days2171 USDGood day
I have a ready news scraper with automatic publication
I can implement such a scraper for you without any problems, according to your requirements
Write to me
-
30 days2171 USD
329 6 0 30 days2171 USDGood morning, I am interested in your project. I have extensive experience in creating parsers for websites and APIs. I am well-versed in parsing libraries, sqlite3, git, and I have worked with cloud services. I can complete your project at a high level. If you have any questions, please feel free to write.
-
7 days2171 USD
852 15 4 7 days2171 USDHello,
I am a developer with 3 years of experience. I will handle your tasks related to data scraping. I use Python (bs4, requests, and selenium). I also work in other languages. You can write to me, and we will talk.
Best regards, Maksim.
-
10 days2171 USD
4097 5 1 10 days2171 USDHello, Jakub.
Thank you for the details.
I read your requirements carefully and understood everything.
As a senior full stack developer with 10 years of experience in Python and WordPress as well as web scraping using this great language, I am confident that I can perfectly execute your project and deliver it on time.
Additionally, I have extensive experience in integrating with the ChatGPT API.
I think my last project is very similar to yours.
Its goal is simply web scraping and analyzing HTML and CSS content and selecting the necessary data, then sending a request to ChatGPT via the API and retrieving it in JSON format.
It works like a bot and supports Windows and MacOS systems.
I joined this development as a senior full stack developer and managed all aspects of the development process, including Git version management.
… I agree with your opinion about using Heroku.
I can perfectly execute your project.
I would like to discuss it with you.
Thank you.
-
14 days2171 USD
577 7 0 14 days2171 USDHello,
I have skills in data scraping using beautifulsoup4 from Python and Node.js (it works the same) as well as databases and script automation.
I prefer to save in a database, but this is a flexible matter and practice shows what is best in a given situation.
To determine the exact price, we need to talk and discuss the details.
The project, from what I see, is multi-stage, so payments would also be made in stages.
Polish is my native language.
… Feel free to contact me.
Best regards,
Korneliia
-
Dzień dobry, z czego wynika budżet 8000zł, czy jest to celowe, i projekt jest na tyle skomplikowany, czy przypadek ?
-
do scrapowania byłoby 5 stron w pierwszej fazie projektu. Jeśli wszystko będzie działać będzie praca do powielenia na następnie około 20 stron.
-
Current freelance projects in the category PHP
Execution of work after SEO audit
334 USD
An experienced OpenCart developer is needed to perform SEO and technical improvements for the online store. Main tasks: Correction of the internal linking structure and menu. Adding links to the footer. Implementation of breadcrumbs with Schema.org microdata. Fixing the… PHP, Website Maintenance ∙ 2 days 10 hours back ∙ 59 proposals |
Fix issues with Facebook API in the OpenCart moduleIn OpenCart, there is a module for integrating Facebook and Instagram via API, OAuth, and Webhook. After opening the module page in the browser, the number of API requests to Facebook starts to increase, and the number of errors gr:get:InvalidID also rises. It is necessary to… PHP, Web Programming ∙ 2 days 11 hours back ∙ 39 proposals |
Integration needed: KeyCRM → Cash Register KashalotIt is necessary to set up integration between KeyCRM and the Cash Register Kahalot. When placing an order in KeyCRM, the data must be automatically transmitted to Kahalot: • order information • products, nomenclature • prices • quantity More details in private. Content Management Systems, PHP ∙ 5 days 6 hours back ∙ 29 proposals |
A WordPress site using the Kadence theme and Kadence Blocks.
223 USD
We need to create a website on WordPress using the Kadence theme and Kadence Blocks. There will be no online store (although it may be added in the future). We need a homepage and several internal pages. The graphics are already prepared, the layout structure is mostly defined,… Content Management Systems, PHP ∙ 6 days 16 hours back ∙ 38 proposals |
Development of 2 SEO-oriented websites for selling spare parts (ATVs and special equipment)Development of Two Specialized Websites for Selling Spare PartsGeneral Information It is necessary to develop two specialized websites: Spare parts for ATVs, UTVs, SSVs, and other similar equipment. Spare parts for special equipment. Existing company website:… PHP, Web Programming ∙ 7 days 13 hours back ∙ 77 proposals |