This is an old revision of the document!
Table of Contents
Selenium
Nos bajamos la imagen de selenium con los drivers de chrome y de firefox:
Chrome: http://chromedriver.chromium.org/downloads
docker run -ti iwanttobefreak/selenium
Entramos en modo interactivo para probarlo:
ipython
Python 2.7.13 (default, Sep 26 2018, 18:42:22) Type "copyright", "credits" or "license" for more information. IPython 5.1.0 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. In [1]:
Ahora podemos lanzar comandos:
from selenium import webdriver from selenium.webdriver.firefox.options import Options options = Options() options.add_argument("--headless") driver = webdriver.Firefox(options=options)
Por ejemplo vamos a buscar un tren en la web de Renfe:
url = 'http://www.renfe.com' driver.get(url)
Y grabamos la url en un fichero:
driver.save_screenshot('/tmp/selenium/renfe.png')
Ejemplo headless chrome con timeout en llamada get
from selenium import webdriver from selenium.webdriver.firefox.options import Options options = Options() options.add_argument("--headless") url1 = 'http://10.255.255.1' url1_timeout = 5 driver = webdriver.Firefox(firefox_options = options) driver.set_page_load_timeout(url1_timeout) driver.get(url1)
Ejemplo headless chrome con timeout en llamada get
from selenium import webdriver from selenium.webdriver.chrome.options import Options from selenium.webdriver.support import expected_conditions as EC options = Options() options.add_argument('--headless') options.add_argument('--no-sandbox') options.add_argument('--disable-dev-shm-usage') url1 = 'http://10.255.255.1' url1_timeout = 5 driver = webdriver.Chrome(desired_capabilities = options.to_capabilities()) driver.set_page_load_timeout(url1_timeout) driver.get(url1)
Python firefox remote headless
Por definición siempre que es remote es headless
from selenium.webdriver import Firefox, FirefoxProfile, Remote host = '172.30.10.18' port = '4444' host = selenium_hub_host port = selenium_hub_port url = f'{host}:{port}/wd/hub' d = Remote(command_executor = url, desired_capabilities = desired_capabilities) # MUY importante para evitar error: # selenium.common.exceptions.ElementClickInterceptedException: Message: Element <input id="projects_" name="projects[]" type="checkbox"> is not clickable at point (179,16) because another element <a href="/"> obscures it d.set_window_size(1920, 1080)
Python firefox remote
1. Arrancar el standalone
version: '3.7' services: hub: container_name: hub image: selenium/standalone-firefox
2. Obtner la IP de ese contenedor
3. Probar
from selenium import webdriver host = "192.168.3.44" port = "4444" desired_capabilities = { 'browserName': 'firefox', 'javascriptEnabled': True, } self.driver = webdriver.Remote(command_executor = host + ':' + port + '/wd/hub', desired_capabilities = desired_capabilities)
Python chrome headless local
IMPORTANTE: si no se especifica el tamaño de la pantalla puede no encontrar objetos en el DOM.
Ejemplo:
from selenium import webdriver from selenium.webdriver.chrome.options import Options CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver' options = Options() options.add_argument('--headless') options.add_argument("window-size=1920,1080") driver = webdriver.Chrome(CHROMEDRIVER_PATH, chrome_options=options)
Python firefox headless local
Sin profile
IMPORTANTE: si no se especifica el tamaño de la pantalla puede no encontrar objetos en el DOM.
Ejemplo:
from selenium.webdriver.firefox.options import Options from selenium.webdriver import Firefox # Local headless options = Options() options.headless = True driver = Firefox(options = options)
Con profile
from selenium.webdriver.firefox.options import Options from selenium.webdriver import Firefox, FirefoxProfile # Local headless options = Options() options.headless = True driver = Firefox(firefox_profile = profile, options = options)
Python firefox headless local con profile y options
https://stackoverflow.com/a/52898225/11436137
from selenium import webdriver; from selenium.webdriver.firefox.options import Options cProfile = webdriver.FirefoxProfile(); dwnd_path = os.getcwd(); cProfile.add_preference('browser.download.folderList', '2'); cProfile.add_preference('browser.download.manager.showWhenStarting', 'false'); cProfile.add_preference('browser.download.dir', 'dwnd_path'); cProfile.add_preference('browser.helperApps.neverAsk.saveToDisk', 'application/octet-stream,application/vnd.ms-excel'); options = Options() options.headless = True driver = webdriver.Firefox(firefox_profile=cProfile, firefox_options=options, executable_path=r'C:\path\to\geckodriver.exe')
Errores
selenium.common.exceptions.ElementClickInterceptedException: Message: Element <input id="projects_" name="projects[]" type="checkbox"> is not clickable at point (179,16) because another element <a href="/"> obscures it
Causa: se ha iniciado un webdriver remoto sin especificar las dimensiones de la ventana
Solución:
driver.set_window_size(1920, 1080)
selenium.common.exceptions.WebDriverException: Message: Failed to decode response from marionette
Causa:
* Se ha especificado un tamaño de ventana con set_window_size() (probablemente es irrelevante) * Se ha quedado sin memoria la instancia de Firefox
Solución: especificar variable “shm_size”. Ejemplo docker-compose:
easyredmine-backup-selenium: container_name: easyredmine-backup-selenium image: selenium/standalone-firefox #restart: unless-stopped # Mandatory, to avoid "Message: Failed to decode response from marionette" error shm_size: ${SHM_SIZE} #environment: # - START_XVFB=False networks: network-easyredmine-backup: aliases: - easyredmine-backup-selenium volumes: - ${DOCKER_HOST_DOWNLOAD_DIR}:${DOCKER_CONTAINER_DOWNLOAD_DIR}
PROXY
Para firefox: webdriver.DesiredCapabilities.FIREFOX['proxy']
Para chrome: webdriver.DesiredCapabilities.CHROME['proxy']
from selenium import webdriver from selenium.webdriver.firefox.options import Options from selenium.webdriver.common.proxy import * PROXY = "172.17.0.1:3128" webdriver.DesiredCapabilities.FIREFOX['proxy'] = { "httpProxy":PROXY, "ftpProxy":PROXY, "sslProxy":PROXY, "proxyType":"MANUAL" } options = Options() options.add_argument("--headless") driver = webdriver.Firefox(options=options)
Sesiones
Firefox
Abrimos firefox en local. Escribimos en la barra about:profiles. Creamos un nuevo pofile y le asignamos un directorio. Lanzamos el profile y navegamos para que nos guarde información, por ejemplo el login de whatsapp.
Lanzamos selenium con la siguiente propiedad:
myprofile = webdriver.FirefoxProfile("<ruta a mi directorio de perfil>") driver = webdriver.Firefox(myprofile)
Chrome
Simplemente le tenemos que indicar un directorio y ya lo graba ahí. Funciona con Chromedriver 70 a 73. Lo podemos descargar de:
https://chromedriver.storage.googleapis.com/index.html
from selenium import webdriver from selenium.webdriver.chrome.options import Options CHROMEDRIVER_PATH = '/chromedriver70/chromedriver' options = Options() options.add_argument('user-data-dir=/selenium/session') options.add_argument("window-size=1920,1080") driver = webdriver.Chrome(CHROMEDRIVER_PATH, options=options)
Enviar mensaje
xpath = './/span[contains(@title, "Armando Bronca")]' o = driver.find_element_by_xpath(xpath) o.click() xpath = './/div[contains(@class, "_3u328 copyable-text selectable-text")]' o = driver.find_element_by_xpath(xpath) o.send_keys('Mensaje enviado desde selenium') xpath = './/button[contains(@class, "_3M-N-")]' o = driver.find_element_by_xpath(xpath) o.click()
Adjuntar fichero
xpath = './/span[contains(@title, "Armando Bronca")]' o = driver.find_element_by_xpath(xpath) o.click() xpath = './/div[contains(@title, "Adjuntar")]' o = driver.find_element_by_xpath(xpath) o.click() o = driver.find_element_by_xpath("//input[@type='file']") o.send_keys(os.getcwd()+"/tmp/caron.png") xpath = './/span[contains(@data-icon, "send-light")]' o = driver.find_element_by_xpath(xpath) o.click()
Padres e hijos
Tenemos el siguiente código:
<div class="floatingWindowDiv" style="position: absolute; z-index: 1001; width: 222px; visibility: visible; top: 444px; display: block; left: 678px;"> <div style="display: block; z-index: 400; position: static; width: 222px; left: 0px; top: 0px;" tabindex="-1"> <div style="display: none;" class="masterMenu DropDownLoading"> <span>Please Wait</span></div> <div class="masterMenu DropDownValueList" tabindex="-1" style="display: block; height: 150px;" loadingcompleted="true"> <div tabindex="-1" class="masterMenuItem promptMenuOption" title="(All Column Values)" style=""> <div class="promptDropdownNoBorderDiv"> <input name="saw_263580_c_1" class="checkboxRadioButton" value="*)nqgtac(*" id="saw_263580_c_1_ck0" aria-labelledby="saw_263580_c_1_ck0_cblabel" tabindex="-1" style="" type="checkbox"> <label class="checkboxRadioButtonLabel" for="saw_263580_c_1_ck0" id="saw_263580_c_1_ck0_cblabel">(All Column Values)</label> </div> </div> <div tabindex="-1" class="masterMenuItem promptMenuOption" title="NULL"> <div class="promptDropdownNoBorderDiv"> <input name="saw_263580_c_1" class="checkboxRadioButton" value="*)nqgtn(*" id="saw_263580_c_1_ck1" aria-labelledby="saw_263580_c_1_ck1_cblabel" tabindex="-1" type="checkbox"> <label class="checkboxRadioButtonLabel" for="saw_263580_c_1_ck1" id="saw_263580_c_1_ck1_cblabel" style="">NULL</label> </div> </div> <div tabindex="-1" class="masterMenuItem promptMenuOption" title="AMA DE CASA"> <div class="promptDropdownNoBorderDiv"> <input name="saw_263580_c_1" class="checkboxRadioButton" value="AMA DE CASA" id="saw_263580_c_1_ck2" aria-labelledby="saw_263580_c_1_ck2_cblabel" tabindex="-1" style="" type="checkbox"> <label class="checkboxRadioButtonLabel" for="saw_263580_c_1_ck2" id="saw_263580_c_1_ck2_cblabel">AMA DE CASA</label> </div> </div> <div tabindex="-1" class="masterMenuItem promptMenuOption" title="AUTONOMO CUENTA"> <div class="promptDropdownNoBorderDiv"> <input name="saw_263580_c_1" class="checkboxRadioButton" value="AUTONOMO CUENTA" id="saw_263580_c_1_ck3" aria-labelledby="saw_263580_c_1_ck3_cblabel" tabindex="-1" style="" type="checkbox"> <label class="checkboxRadioButtonLabel" for="saw_263580_c_1_ck3" id="saw_263580_c_1_ck3_cblabel">AUTONOMO CUENTA</label> </div> </div> </div> </div> </div>
Queremos seleccionar input pero dentro del div que tiene lel texto en title.
Primero seleccionamos el bloque padre, buscando por div y title que queramos:
menu_click='NULL' menu_click='AMA DE CASA' menu_click='CUENTA AJENA FIJO' xpath='//div[@title="' + menu_click + '"]' padre = driver.find_element_by_xpath(xpath)
Ahora dentro de esa caja, selecctionamos el input para hacer click
xpath2='.//input[@type="checkbox"]' hijo = padre.find_element_by_xpath(xpath2) hijo.click()