I am going to use Selenium for web crawling.
1. Import all the necessary packages.
import random
import time

from openpyxl import Workbook
from selenium import webdriver
2. Web crawling
# Write-only workbook: one sheet for the review text, one for the dates.
wb = Workbook(write_only=True)
ws1 = wb.create_sheet('except_date')
ws2 = wb.create_sheet('date')
ws1.append(['brand', 'product', 'review'])
ws2.append(['date'])

# Start Chrome. The calls below use the Selenium 3 API
# (the find_element_by_* helpers were removed in Selenium 4.3).
options = webdriver.ChromeOptions()
path = '/path/to/chromedriver'
driver = webdriver.Chrome(path, options=options)
driver.implicitly_wait(3)
driver.get('https://www.sephora.com/product/high-impact-lash-elevating-mascara-P421489?skuId=1968247&keyword=CLINIQUE%20High%20Impact%20Lash%20Elevating%20Mascara')

# Close the pop-up.
close_popup = driver.find_element_by_css_selector('svg.css-1ikgx7p.eanm77i0')
close_popup.click()
time.sleep(3)

# Scroll down toward the reviews.
driver.execute_script('window.scrollTo(0, 3050)')
time.sleep(1)

# Open the sort menu and choose "Most Helpful".
sort = driver.find_elements_by_css_selector('div.css-tsrkv7')[1]
sort.click()
most_helpful = driver.find_element_by_xpath('//button[@class="css-1aawth6 eanm77i0"][contains(text(), "Most Helpful")]')
most_helpful.click()
time.sleep(1)
# Collect the review texts and dates from 14 pages.
for i in range(14):
    review_texts = driver.find_elements_by_xpath('//div[@class="css-1s11tbv eanm77i0"]')
    dates = driver.find_elements_by_xpath('//span[@class="css-ak0g49 eanm77i0"]')
    for review_text in review_texts:
        review_t = review_text.text
        ws1.append(['CLINIQUE', 'High Impact Lash Elevating Mascara', review_t])
    for date in dates:
        d = date.text
        ws2.append([d])
    # Go to the next page, then wait a random 2 to 3.5 seconds.
    next_page = driver.find_elements_by_css_selector('li.css-1579ltc')[8]
    next_page.click()
    time.sleep(random.uniform(2, 3.5))

driver.quit()
wb.save('un4_h.xlsx')
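The page loop above writes reviews and dates to two separate sheets, so the two only line up by row order. As a rough sketch (not the method used in this post), the same loop could be wrapped in a function that keeps each review next to its date and stops early when the pager has no "next" item. The helper name `collect_reviews`, its parameters, and the early-stop rule are my own assumptions for illustration. The selectors are copied from the code above and may no longer match the live page. The strings "xpath" and "css selector" are the values of Selenium's `By.XPATH` and `By.CSS_SELECTOR`, passed to the generic `find_elements(by, value)` call.

```python
import random
import time
from itertools import zip_longest

# Locator strategy strings; these equal selenium's By.XPATH and By.CSS_SELECTOR.
XPATH = "xpath"
CSS = "css selector"

# Selectors copied from the post; the generated class names may have changed.
REVIEW_XPATH = '//div[@class="css-1s11tbv eanm77i0"]'
DATE_XPATH = '//span[@class="css-ak0g49 eanm77i0"]'
PAGER_CSS = "li.css-1579ltc"
NEXT_INDEX = 8  # position of the "next page" item in the pager, as in the post


def collect_reviews(driver, max_pages=14, pause=(2.0, 3.5), sleep=time.sleep):
    """Return (review, date) tuples from up to max_pages pages.

    Hypothetical helper: `driver` is any object with a Selenium-style
    find_elements(by, value) method.
    """
    rows = []
    for _ in range(max_pages):
        reviews = [el.text for el in driver.find_elements(XPATH, REVIEW_XPATH)]
        dates = [el.text for el in driver.find_elements(XPATH, DATE_XPATH)]
        # Pad the shorter list so a count mismatch on this page
        # does not shift the rows of later pages.
        rows.extend(zip_longest(reviews, dates, fillvalue=""))

        pager = driver.find_elements(CSS, PAGER_CSS)
        if len(pager) <= NEXT_INDEX:
            break  # no "next page" item found: assume this was the last page
        pager[NEXT_INDEX].click()
        sleep(random.uniform(*pause))  # randomized wait between pages
    return rows
```

With a real driver, `rows = collect_reviews(driver)` would return a list that can be written to a single sheet, one `ws.append([brand, product, review, date])` per tuple.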
Result of the code above: see the attached Excel file, un4_h.xlsx.
* Unauthorized copying and distribution of this post are not allowed.