Budget: 3000 UAH Deadline: 2 days
Good day, I would be glad to help.
The algorithm should include descriptions of:
1. How coverage of all listings is achieved. The interface can offer a choice of region to parse from. For example, for Moscow we choose to parse all categories or only the ones we are interested in. The results are saved to a database hosted on the server (MongoDB or MySQL), so that later only new listings need to be parsed. Listings will be stored by category.
2. How listing content is fetched. For example, from the page https://www.avito.ru/moskva we extract all links to listings and parse each one in turn; a link such as https://www.avito.ru/moskva/lichnye_veschi allows parsing by category. Phone numbers on Avito are stored as base64-encoded PNG images; we download them and store them in the database, linked to each listing.
3. How the program protects itself against proxy bans, and how many proxies are needed. Proxies plus a wait before sending the next request: for example, each region or category is parsed in a separate thread and uses a different proxy server. The proxy is then rotated and parsing of the listings themselves continues. I do not know how many proxies are needed; it may turn out that parsing works without bans. This needs testing.
4. Which technologies (programming language, database). Language: C++, with the interface built in Qt Creator; database: MongoDB or MySQL.
5. What is better to run it on, server software or a desktop PC, and why. A server is better: it runs without interruptions, and the parser can be configured once and left to run on its own, with an occasional check for errors. A desktop PC can suffer various shutdowns, although force majeure can happen with a server too.
6. Server/desktop PC architecture. Server: Windows Server 2012 R2, 1 GB RAM, 64- or 32-bit system. Desktop: Windows 7/8/10, 1 GB RAM, 64- or 32-bit system.
7. What speed is expected for collecting 1 million listings (with justification).
6-8 listings per minute per thread. On a 4-core machine we run 4 threads, which gives about 44 thousand listings per day (close to 8 × 1,440 minutes × 4 threads = 46,080), so 1 million takes about 22 days. This is not exact either; it may be possible to parse more than 6-8 per minute. At 30 per minute, purely theoretically, one thread handles 43,200 listings per day and 4 threads handle 172,800 per day, so 1 million takes about 6 days.
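The throughput estimate can be checked with a few lines of arithmetic. The sketch below is in Python purely for brevity (it is not the C++/Qt stack proposed above), and it assumes the quoted rates are per minute and per thread, counted over a full 24-hour day, which is the reading under which 30 × 60 × 24 = 43,200 and 4 × 43,200 = 172,800. The function names are illustrative.

```python
MINUTES_PER_DAY = 60 * 24  # 1,440


def listings_per_day(per_minute: float, threads: int) -> float:
    """Listings collected in 24 hours at a steady per-thread rate."""
    return per_minute * MINUTES_PER_DAY * threads


def days_to_collect(target: int, per_minute: float, threads: int) -> float:
    """Days needed to reach `target` listings at that rate."""
    return target / listings_per_day(per_minute, threads)


# Compare the pessimistic, expected, and optimistic per-thread rates.
for rate in (6, 8, 30):
    daily = listings_per_day(rate, threads=4)
    days = days_to_collect(1_000_000, rate, threads=4)
    print(f"{rate} per minute x 4 threads: {daily:,.0f} per day, {days:.1f} days for 1M")
```

The real rate depends on request delays and proxy bans, so these numbers are an upper bound for each assumed rate rather than a measured result.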
Budget: 500 UAH Deadline: 1 day
Ready to write this up.
Current freelance projects in the category Data Parsing
Task: Parsing participants from 16 Viber groups There are 16 Viber groups where the possibility of parsing participants is available. Required: Obtain a list of participants from each Viber group. Compile the data into one Google Sheet or Excel file. If possible, save the following fields: group name; participant's name; phone number, if available; Viber ID / username, if available; date of parsing; from which group the contact was taken. Remove duplicates among all 16 groups. Show separately: the total number of collected participants; the number of unique contacts; the number of duplicates; the groups from which data was collected / not collected. Provide a final file with a clean contact database. Main condition: Parsing must be done carefully, without blocking accounts, groups, or losing access to Viber groups.
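The deduplication and reporting part of this task could look roughly like the Python sketch below. It assumes the participants have already been exported into rows with the hypothetical fields `group`, `name`, `phone`, and `viber_id`; how the rows are obtained from Viber is outside its scope. The matching rule (phone digits first, then Viber ID, then name) is an assumption, not something the task specifies.

```python
def contact_key(row):
    """Build a comparison key: phone digits, else Viber ID, else name."""
    digits = "".join(ch for ch in (row.get("phone") or "") if ch.isdigit())
    if digits:
        return ("phone", digits)
    viber_id = (row.get("viber_id") or "").strip().lower()
    if viber_id:
        return ("viber_id", viber_id)
    return ("name", (row.get("name") or "").strip().lower())


def summarize(rows, all_groups):
    """Deduplicate contacts across groups and compute the requested counts."""
    unique = {}
    for row in rows:
        # Keep the first occurrence, so the source group of a contact is preserved.
        unique.setdefault(contact_key(row), row)
    collected = {row["group"] for row in rows}
    return {
        "total": len(rows),
        "unique": len(unique),
        "duplicates": len(rows) - len(unique),
        "groups_collected": sorted(collected),
        "groups_not_collected": sorted(set(all_groups) - collected),
        "clean": list(unique.values()),
    }
```

One limitation of this rule: a person who appears with a phone number in one group and only a Viber ID in another is counted twice, so a manual pass over near-matches may still be needed.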
Task: You need to perform a test parsing and collect a database of Telegram channels in the niche of trading and cryptocurrency.
Requirements for channels:
• Theme: trading, futures, scalping, signals, smart money, trading bots, trading education, cryptocurrency, BTC, ETH, etc.
• Language: Russian and Ukrainian.
• Number of subscribers: 300 - 8000.
• Channels can be either public or private (with application submission).
• It is preferable to pay attention to live audience, not inflated subscribers.
Keywords for search: Trading, Futures, Long, Short, Scalping, Fibonacci, Indicators, Trading Bot, Smart Money, Bitcoin, Crypto, Trading Strategy, How to earn in crypto, and other relevant terms.
What is needed as a result (test phase):
• A table in Excel/Google Sheets for approximately 200–300 channels.
• Columns:
  • Link to the channel
  • Channel name
  • Number of subscribers
  • Approximate assessment of the live audience (if possible)
  • Growth over the last month (if possible)
  • Type of channel (public/private)
  • Link to chat (if available)
  • Brief theme (for example: signals, news, education, scalping, reviews, etc.)
If the result of the test parsing satisfies me, we will continue with a larger volume (500-700 channels).
Budget: negotiable
Deadline: 2–4 days (possibly more if more time is needed)
Work through a secure deal.
Examples of previous work on parsing Telegram channels are welcome.
A specialist is needed to collect and structure open information about sellers from marketplaces. It is necessary to determine the possibility of automated data collection and to form a database of sellers. In your response, please indicate: which marketplaces you have experience working with; what data you can obtain (seller name, link, categories, rating, number of products, other available fields); examples of similar projects.
Technical task
Project
Configuration of filling and synchronization of two Prom.ua stores with suppliers of auto parts.
Task
It is necessary to implement the loading and updating of products from auto parts suppliers for two online stores on Prom.ua.

1. Connecting suppliers
It is necessary to connect suppliers through:
Supplier API;
XML, CSV, XLS price lists;
or another available method of obtaining products from the supplier's website.
It is important to ensure complete synchronization of products between the supplier and the Prom.ua stores.

2. Filtering and selection of products
It is necessary to implement the ability to select products during import based on the following parameters:
Car brand;
Category of parts;
Subcategory of parts;
Other available characteristics.
Example: For each store, there should be the ability to separately determine which categories of products and which car brands need to be loaded.
Additional requirements for product selection
It is necessary to implement the ability to select products during import not only by car brands and categories of parts but also by product availability status. There should be the ability to configure the following scenarios:
Import only products that are in stock with the supplier;
Do not import products with the status "out of stock";
Disable or remove products from Prom.ua after they are out of stock with the supplier.
During pricing configuration, there should be the ability to combine filters:
By car brand;
By category of parts;
By subcategory;
By product availability.
Example: Import only parts for Volkswagen and Audi, category "Braking system", that are in stock with the supplier.
It is also necessary to implement a mechanism to prevent duplication of products from different suppliers. If the same product is present with multiple suppliers, only one product record should be imported into the catalog.
Criteria for selecting a product when duplicates are detected:
Priority is given to the product that is in stock with the supplier;
If the product is in stock with several suppliers, priority is given to the product with the lowest price;
If the cheapest product is out of stock, the system should choose the cheapest product among those that are in stock;
Duplicate products from other suppliers should not create separate entries in the catalog.
Example: Import only parts for Volkswagen and Audi, category "Braking system", that are in stock with the supplier. If the same part is available from several suppliers, only one entry is imported into the catalog, from the supplier with the lowest price among those who have the product in stock.

3. Import of product cards
During import, the following should be automatically loaded:
Product name;
Article;
Photos;
Product description;
Price;
Product characteristics;
Manufacturer;
Other available parameters.

4. Updates
It is necessary to set up automatic:
Price updates when changed by the supplier

5. Removal of unavailable products
Products that are no longer in stock with the supplier should:
Be disabled; or
Be removed from Prom.ua (by agreement).

6. Filling the stores
It is necessary to:
Create a category structure;
Create subcategories;
Correctly distribute products across categories;
Check the correctness of product import.

7. Work results
After the work is completed, there should be:
Suppliers connected;
Product import configured;
Price updates configured;
New product addition configured;
Disabling or removal of unavailable products configured;
Prom.ua stores fully filled and ready for operation.
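The duplicate-selection criteria from section 2 can be expressed as a single ordering: in-stock offers first, then lowest price. The Python sketch below illustrates this under two assumptions that the task does not state: duplicates are matched by `article` alone, and the `Offer` fields are hypothetical names rather than fields of any supplier feed or of the Prom.ua API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    supplier: str
    article: str
    price: float
    in_stock: bool


def rank(offer: Offer):
    """Lower is better: in-stock offers sort before out-of-stock, then by price."""
    return (not offer.in_stock, offer.price)


def pick_offers(offers):
    """Keep exactly one offer per article according to the ranking above."""
    best = {}
    for offer in offers:
        current = best.get(offer.article)
        if current is None or rank(offer) < rank(current):
            best[offer.article] = offer
    return best
```

If no supplier has the part in stock, this sketch still returns the cheapest out-of-stock offer; under the "import only in-stock products" scenario the caller would filter those out afterwards.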
A Telegram bot is needed for automatic searching and monitoring of "BUY IT NOW" cars at auctions in the USA (Copart, IAAI). The bot should operate automatically and send notifications about new cars that meet the specified filters.
Main functionality
Filter settings: 1. Car brand; 2. Model; 3. Year of manufacture (from/to); 4. Fuel type; 5. Engine volume; 6. Mileage; 7. Price range.
Bot functions: 1. Automatic monitoring of new lots; 2. Checking for updates every 1-2 minutes; 3. Protection against duplicate notifications (anti-duplicate); 4. Ability to add and remove filters through the bot menu; 5. Saving settings of already existing car searches.
Message format: 1. Photo of the car (4 photos); 2. Title and lot number; 3. Year of manufacture; 4. Mileage; 5. Engine type and volume; 6. Buy it now price; 7. Link to the lot.
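The filter matching and anti-duplicate logic could be sketched in Python as follows. This is only an illustration: the lot dictionary keys (`auction`, `lot_number`, `make`, `model`, `year`, `mileage`, `buy_now_price`) are hypothetical and do not reflect any actual Copart or IAAI data format, fuel type and engine volume are left out for brevity, and fetching lots and sending Telegram messages are not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CarFilter:
    make: str
    model: Optional[str] = None
    year_from: Optional[int] = None
    year_to: Optional[int] = None
    price_max: Optional[float] = None
    mileage_max: Optional[int] = None

    def matches(self, lot: dict) -> bool:
        """True if the lot satisfies every condition that is set."""
        if lot["make"].lower() != self.make.lower():
            return False
        if self.model and lot["model"].lower() != self.model.lower():
            return False
        if self.year_from is not None and lot["year"] < self.year_from:
            return False
        if self.year_to is not None and lot["year"] > self.year_to:
            return False
        if self.price_max is not None and lot["buy_now_price"] > self.price_max:
            return False
        if self.mileage_max is not None and lot["mileage"] > self.mileage_max:
            return False
        return True


def new_matches(lots, filters, seen):
    """Return matching lots not yet notified, and remember them in `seen`."""
    fresh = []
    for lot in lots:
        key = (lot["auction"], lot["lot_number"])
        if key in seen:
            continue  # anti-duplicate: this lot was already sent
        if any(f.matches(lot) for f in filters):
            seen.add(key)
            fresh.append(lot)
    return fresh
```

In a real bot the `seen` set would need to be persisted (for example in a database) so that a restart does not resend old lots, and `new_matches` would be called on each poll, every 1-2 minutes as the task requires.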