Write a MapReduce job for the Amazon product dataset (MapReduce basics)
20 USD
I am looking for someone to do a basic data processing task, specifically to write a MapReduce job for a list of products and reviews.
In the file, each line holds one review in JSON format. The fields are:
- reviewerID - string - the ID of the author of the review
- asin - string - unique product identifier
- reviewerName - string - name of the reviewer
- helpful - array of two integers [a,b] - helpfulness rating of the review: a out of b customers found the review helpful
- reviewText - string - the content of the review; this is the text to be processed
- overall - float - rating given to product asin by reviewer reviewerID
- summary - string - the title of the review
- unixReviewTime - integer - timestamp of when the review was created, in UNIX format
- reviewTime - string - date when the review was created, in human-readable format
- category - string - the category that the product belongs to
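Reading one record could look roughly like the sketch below. It assumes only `category` and `reviewText` are needed downstream; the helper name `parse_review` and the shortened sample record are made up for illustration.

```python
import json

def parse_review(line):
    """Parse one JSON line and return the two fields used later: (category, reviewText)."""
    record = json.loads(line)
    return record["category"], record["reviewText"]

# Hypothetical, shortened record in the same shape as the dataset lines.
sample = '{"asin": "0981850006", "reviewText": "Delish food", "overall": 5.0, "category": "Patio_Lawn_and_Garde"}'
category, text = parse_review(sample)
print(category, "|", text)
```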
What needs to be done:
To prepare the file for classification, we need to select the "terms" that discriminate between classes. So the task is to write a MapReduce job that computes chi-square values over the dataset.
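The posting does not say which chi-square formula is expected. A common choice for term selection is the 2x2 contingency form, N(AD - BC)^2 / ((A+B)(A+C)(B+D)(C+D)), and the sketch below assumes that one. The function name and the example counts are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square score of a term for a category, from a 2x2 contingency table.

    a: reviews in the category that contain the term
    b: reviews outside the category that contain the term
    c: reviews in the category that do not contain the term
    d: reviews outside the category that do not contain the term
    """
    n = a + b + c + d
    denominator = (a + b) * (a + c) * (b + d) * (c + d)
    if denominator == 0:
        return 0.0  # a row or column of the table is empty, so the score is undefined; treat as 0
    return n * (a * d - b * c) ** 2 / denominator

# Hypothetical counts: the term occurs in 8 of 10 reviews inside the category
# and in 2 of 10 reviews outside it.
print(chi_square(8, 2, 2, 8))
```

A term that is spread evenly across categories scores 0, and the score grows as the term concentrates in (or avoids) one category.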
The text first needs to be preprocessed:
- Tokenization > !?,;:()[]{}-_"'`~#&*%$\/ as delimiters
- Case folding
- Stopword filtering > a separate file is provided for this
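The three preprocessing steps above can be sketched as follows. Treating whitespace as a delimiter is an assumption (the list only names punctuation), and the `STOPWORDS` set is a hypothetical stand-in for the contents of the separate stopword file.

```python
import re

# Delimiter characters exactly as listed in the task.
DELIMITERS = '!?,;:()[]{}-_"\'`~#&*%$\\/'
# Assumption: whitespace also separates tokens.
SPLIT_PATTERN = re.compile("[" + re.escape(DELIMITERS) + r"\s]+")

# Hypothetical stand-in for the stopword file.
STOPWORDS = {"he", "s", "us"}

def tokenize(text, stopwords):
    """Case-fold, split on the delimiters, and drop empty tokens and stopwords."""
    tokens = SPLIT_PATTERN.split(text.lower())
    return [token for token in tokens if token and token not in stopwords]

print(tokenize("He's making us things; Yum!!", STOPWORDS))
```

Note that the period is not in the delimiter list, so with this pattern a word at the end of a sentence keeps its trailing dot.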
The MapReduce job itself should compute chi-square values for every category, sort the terms by their value (keeping the top 150 terms), and merge the lists across all categories.
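A minimal in-memory sketch of these three stages is shown below. It is not tied to Hadoop or any other framework, since the posting names none. Several things are assumptions: counting each term once per review, scoring only terms that occur at least once in a category, taking the top k per category, merging into an alphabetically sorted list, and the 2x2 chi-square formula. All function names and the sample reviews are hypothetical.

```python
from collections import Counter, defaultdict

REVIEWS_KEY = "__reviews__"  # sentinel key; real tokens cannot contain "_" because it is a delimiter

def map_review(category, tokens):
    """Map step: emit a review count for the category and one count per distinct term."""
    yield (REVIEWS_KEY, category), 1
    for term in set(tokens):
        yield (term, category), 1

def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals

def chi_square_by_category(totals):
    """Turn the summed counts into {category: {term: chi-square}}."""
    reviews_per_cat = {c: v for (t, c), v in totals.items() if t == REVIEWS_KEY}
    n = sum(reviews_per_cat.values())
    term_total = Counter()
    for (t, c), v in totals.items():
        if t != REVIEWS_KEY:
            term_total[t] += v
    scores = defaultdict(dict)
    for (t, cat), a in totals.items():
        if t == REVIEWS_KEY:
            continue
        b = term_total[t] - a        # reviews with the term outside the category
        c = reviews_per_cat[cat] - a  # reviews in the category without the term
        d = n - a - b - c             # all remaining reviews
        denom = (a + b) * (a + c) * (b + d) * (c + d)
        scores[cat][t] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def top_terms(scores, k=150):
    """Keep the k best terms per category, then merge them into one deduplicated sorted list."""
    per_category = {
        cat: sorted(terms.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        for cat, terms in scores.items()
    }
    merged = sorted({term for ranked in per_category.values() for term, _ in ranked})
    return per_category, merged

# Hypothetical, already tokenized reviews: (category, tokens)
reviews = [
    ("Book", ["great", "story"]),
    ("Book", ["story", "plot"]),
    ("Garden", ["great", "grill"]),
    ("Garden", ["grill", "food"]),
]
pairs = [pair for cat, toks in reviews for pair in map_review(cat, toks)]
per_category, merged = top_terms(chi_square_by_category(reduce_counts(pairs)), k=2)
print(per_category)
print(merged)
```

On a real cluster the counting would run as one job and the scoring and ranking as a second one, because the chi-square formula needs the global totals that only exist after the first reduce.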
Here is an example line:
{"reviewerID": "A2VNYWOPJ13AFP", "asin": "0981850006", "reviewerName": "Amazon Customer \"carringt0n\"", "helpful": [6, 7], "reviewText": "This was a gift for my other husband. He's making us things from it all the time and we love the food. Directions are simple, easy to read and interpret, and fun to make. We all love different kinds of cuisine and Raichlen provides recipes from everywhere along the barbecue trail as he calls it. Get it and just open a page. Have at it. You'll love the food and it has provided us with an insight into the culture that produced it. It's all about broadening horizons. Yum!!", "overall": 5.0, "summary": "Delish", "unixReviewTime": 1259798400, "reviewTime": "12 3, 2009", "category": "Patio_Lawn_and_Garde"}