| author | Aadniz <8147434+Aadniz@users.noreply.github.com> | 2025-11-06 07:00:48 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-11-06 07:00:48 +0100 |
| commit | b1918dd12110c183fc1fba1c51778a32e6cb4049 (patch) | |
| tree | 1a126f7d0eef5a6c56b204f67eb8b720887ebb43 | |
| parent | 1be19f8b5820d1c7b369f80cc48c6014a6d41085 (diff) | |
[fix] yandex engine: capture captcha from header instead of url path (#5417)
The Yandex engine returns a parsing error instead of reporting that a CAPTCHA was found, which is confusing for admins and users (#5415).
This patch fixes an issue where the CAPTCHA response from Yandex wasn't detected, resulting in a `ParserError` when trying to parse the response into a DOM.
In this fix, I replaced the URL path condition with a check that the `x-yandex-captcha` header is set and equal to `captcha`.
Alternatively, something like `resp.headers.get('Location', '').startswith("https://yandex.com/showcaptcha")` could be used instead. Setting `params['allow_redirects'] = True` would also work, but it costs an extra request. Let me know if one of these is preferred.
Closes: https://github.com/searxng/searxng/issues/5415
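For reviewers comparing the two header-based options, here is a minimal sketch of both checks side by side. It is not the engine code: `FakeResponse` and `CaptchaDetected` are hypothetical stand-ins for the real response object and `SearxEngineCaptchaException`, and the example header values are made up for illustration.

```python
from dataclasses import dataclass, field


class CaptchaDetected(Exception):
    """Hypothetical stand-in for SearxEngineCaptchaException."""


@dataclass
class FakeResponse:
    """Hypothetical minimal response: a status code plus headers."""
    status_code: int = 200
    headers: dict = field(default_factory=dict)

    def __post_init__(self):
        # HTTP header names are case-insensitive; emulate that here
        # by lowercasing the keys of the plain dict.
        self.headers = {k.lower(): v for k, v in self.headers.items()}


def is_captcha_by_header(resp):
    # Check used in this patch: dedicated Yandex CAPTCHA header.
    return resp.headers.get('x-yandex-captcha') == 'captcha'


def is_captcha_by_location(resp):
    # Alternative from the commit message: inspect the redirect target.
    location = resp.headers.get('location', '')
    return location.startswith('https://yandex.com/showcaptcha')


def catch_bad_response(resp):
    if is_captcha_by_header(resp):
        raise CaptchaDetected()


# Assumed example responses; the concrete values are illustrative only.
blocked = FakeResponse(
    302,
    {'X-Yandex-Captcha': 'captcha',
     'Location': 'https://yandex.com/showcaptcha?retpath=example'},
)
normal = FakeResponse(200, {'Content-Type': 'text/html'})
```

Both checks work on the unfollowed redirect response, so neither needs the extra request that `allow_redirects = True` would cause. The header check does not depend on the CAPTCHA host name, which is one reason to prefer it.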
| -rw-r--r-- | searx/engines/yandex.py | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/searx/engines/yandex.py b/searx/engines/yandex.py
index 2c6984fdc..77b03067b 100644
--- a/searx/engines/yandex.py
+++ b/searx/engines/yandex.py
@@ -35,7 +35,7 @@ content_xpath = './/div[@class="b-serp-item__content"]//div[@class="b-serp-item_
 def catch_bad_response(resp):
-    if resp.url.path.startswith('/showcaptcha'):
+    if resp.headers.get('x-yandex-captcha') == 'captcha':
         raise SearxEngineCaptchaException()