path: root/searx/engines/google.py
Age | Commit message | Author
2022-11-11 | Switch back to protobuf for raw HTML | Émilien Devos
2022-11-11 | Fix Google search engine. | ngosang
- Fix broken links. Resolves #1794
- Fix missing results. Resolves #1829
2022-09-27 | [fix] typos / reported by @kianmeng in searx PR-3366 | Markus Heiser
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-10 | [fix] google - simplify XPath selectors to fetch more results | Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-10 | output format protobuf to HTML for google mobile | Émilien Devos
2022-08-10 | Revert PR #1633 | Brock Vojković
This reverts the changes made to the Google results XPath in PR #1633.
2022-08-09 | [fix] google engine: results XPath | Léon Tiekötter
It seems google rolls out changes first on the `google.com` domain and later on the "language" domains. For example: yesterday [1] `google.com` did not work but `google.de` and `google.fr` did; today they no longer work either, and this fix is needed on all domains.

Closes: https://github.com/searxng/searxng/issues/1628

[1] https://github.com/searxng/searxng/issues/1628#issuecomment-1208191816
2022-08-01 | [mod] add 'Accept-Language' HTTP header to online processors | Markus Heiser
Most engines that support languages (and regions) use the Accept-Language header from the web browser to build a response that fits the language (and region).

- add new engine option: send_accept_language_header

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
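A minimal sketch of how an opt-in option like this might be applied when a request is built. Only the option name `send_accept_language_header` comes from the commit message; the helper `apply_accept_language`, the dictionary layouts, and the quality values are hypothetical and not taken from the searx sources.

```python
def apply_accept_language(engine_settings, request_params, locale_tag):
    """Add an Accept-Language header when the engine opts in.

    Hypothetical helper: the settings/params dictionaries and the q-values
    are illustrative assumptions, not the searx implementation.
    """
    if not engine_settings.get("send_accept_language_header", False):
        # Engine did not opt in: leave the request untouched.
        return request_params

    language = locale_tag.split("-")[0]
    if language == locale_tag:
        # Language only, e.g. "de"
        value = f"{language},*;q=0.5"
    else:
        # Language and region, e.g. "de-DE"
        value = f"{locale_tag},{language};q=0.7,*;q=0.3"

    request_params.setdefault("headers", {})["Accept-Language"] = value
    return request_params


params = apply_accept_language({"send_accept_language_header": True}, {}, "de-DE")
print(params["headers"]["Accept-Language"])
```

Keeping the option off by default in this sketch means engines that never set it behave exactly as before.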
2022-07-26 | Revert "Quick fix for google engine for EU countries" | Markus Heiser
This reverts commit 747cf1a246df587aeb3b6b175c315ef0b9612dc4.
2022-07-26 | [fix] google engine: results XPath | Léon Tiekötter
2022-07-25 | Quick fix for google engine for EU countries | Émilien Devos
This reverts part of commit https://github.com/searxng/searxng/commit/5fb2071cb2248c0f0ada7affb0c47f841ddbf102
2022-07-25 | [fix] google & youtube - set EU consent cookie | Emilien Devos
This changes the previous bypass method for the Google consent, which used ``ucbcb=1`` (6face215b8), to accepting the consent with ``CONSENT=YES+``. The youtube_noapi and google engines have a similar API, at least for the consent [1].

Get the CONSENT cookie from a google request::

    curl -i "https://www.google.com/search?q=time&tbm=isch" \
         -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
         | grep -i consent
    ...
    location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
    set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
    ...

PENDING & YES [2]:

    Google change the way for consent about YouTube cookies agreement in EU countries. Instead of showing a popup in the website, YouTube redirects the user to a new webpage at consent.youtube.com domain ... Fix for this is to put a cookie CONSENT with YES+ value for every YouTube request

[1] https://github.com/iv-org/invidious/pull/2207
[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592

Closes: https://github.com/searxng/searxng/issues/1432
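The cookie approach described above can be sketched as follows. The cookie name and value (``CONSENT=YES+``) come from the commit message; the helper name `add_consent_cookie` and the shape of the request dictionary are assumptions made for illustration.

```python
def add_consent_cookie(request_params):
    """Mark the EU consent as accepted on an outgoing request.

    Hypothetical helper: ``request_params`` is assumed to be a plain dict
    with an optional "cookies" mapping that an HTTP client sends as cookies.
    """
    cookies = request_params.setdefault("cookies", {})
    cookies["CONSENT"] = "YES+"  # value taken from the commit message
    return request_params


params = add_consent_cookie({"url": "https://www.google.com/search?q=time"})
print(params["cookies"])
```

Existing cookies in the dictionary are kept; only the ``CONSENT`` entry is set or overwritten.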
2022-07-09 | bypass google consent with ucbcb=1 | Emilien Devos
2022-05-10 | Reflect the real world parameter from settings.yml | Émilien Devos
2022-02-09 | Update the XPath for fetching the Google results | Émilien Devos
2022-01-18 | [fix] google engine - "some results are invalids: invalid content" | Markus Heiser
Fix google issues listed in `/stats?engine=google` with the message::

    some results are invalids: invalid content

The log is::

    DEBUG   searx : result: invalid content: {'url': 'https://de.wikipedia.org/wiki/Foo', 'title': 'Foo - Wikipedia', 'content': None, 'engine': 'google'}
    WARNING searx.engines.google : ErrorContext('searx/search/processors/abstract.py', 111, 'result_container.extend(self.engine_name, search_results)', None, 'some results are invalids: invalid content', ()) True

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-18 | [fix] google engine: remove ads and fix mobile_ui selector | Markus Heiser
1. Fix issue reported in comment [1]
2. Fix XPath selector for the response of google's mobile UI, reported in comment [2]

[1] https://github.com/searxng/searxng/pull/777#issuecomment-1015121322
[2] https://github.com/searxng/searxng/pull/777#issuecomment-1015236238

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-17 | Update XPath for Google engine | Émilien Devos
2022-01-05 | [enh] add more categories | Martin Fischer
2021-12-27 | [format.python] initial formatting of the python code | Markus Heiser
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-25 | [fix] google engine - suggestion | Markus Heiser
BTW: google no longer offers *spelling suggestions*

Closes: https://github.com/searxng/searxng/issues/442

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 | [pylint] engines: drop no longer needed 'missing-function-docstring' | Markus Heiser
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 | [fix] drop useless pylint: disable=undefined-variable | Markus Heiser
Since 7b235a1 (see line 591) it is no longer needed to disable 'undefined-variable' for names defined in::

    PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06 | [mod] one logger per engine - drop obsolete logger.getChild | Markus Heiser
Remove the no longer needed `logger = logger.getChild(...)` from engines.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-08-21 | [enh] google: add filter=0 to Google engine for more results | Noémi Ványi
backport from searx ( 23b3b56a06ef831af0a1b30a12c26ebd50e329bb )
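As a small illustration of what adding this parameter to a query string looks like, here is a sketch using the standard library. The parameter `filter=0` is from the commit title; the helper name `build_query` and the other arguments are assumptions and do not reproduce the engine's real parameter set.

```python
from urllib.parse import urlencode


def build_query(query):
    """Build a query string that includes ``filter=0``.

    Hypothetical helper: the real engine sends more parameters than shown.
    """
    return urlencode({"q": query, "filter": "0"})


print(build_query("searx engine"))
```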
2021-07-15 | Add missing parameter for mobile UI search | Émilien Devos
2021-06-21 | [docs] add documentation from the sources of the google engines | Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 | [fix] google answers: normalize space of the answers. | Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 | [mod] google engine: reduce mobile UI parameters to what is needed | Markus Heiser
Reverse engineering shows that not all of the parameters used by google's mobile UI (aka "more results" button) are needed [1].

[1] https://github.com/searxng/searxng/pull/160#issuecomment-865013625

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 | [mod] google: add "use_mobile_ui" parameter to use mobile endpoint. | Alexandre Flament
Disabled by default; it has to be enabled in settings.yml.

Related to #159
2021-06-11 | [mod] google - get_lang_info add documentation & comments | Markus Heiser
BTW: remove obsolete log messages from google engine

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-10 | [enh] google engine: supports "default language" | Alexandre Flament
Same behaviour as Whoogle [1]. Only the google engine with the "Default language" choice "(all)" is changed by this patch.

When searching for a local place, the results are in the expected language, without missing results [2]:

> When a language is not specified, the language interpretation is left up to
> Google to decide how the search results should be delivered.

The query parameters are copied from Whoogle. With the ``all`` language:

- add parameter ``source=lnt``
- don't use parameter ``lr``
- don't add an ``Accept-Language`` HTTP header.

The new signature of function ``get_lang_info()`` is:

    lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)

Argument ``supported_any_language`` is True for google.py and False for the other google engines.

With this patch the function now returns:

- query parameters: ``lang_info['params']``
- HTTP headers: ``lang_info['headers']``
- and as before this patch:
  - ``lang_info['subdomain']``
  - ``lang_info['country']``
  - ``lang_info['language']``

[1] https://github.com/benbusby/whoogle-search
[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
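The behaviour described in this commit can be sketched with a deliberately simplified stand-in. The function below is hypothetical: it does not use the real signature of ``get_lang_info()``, the fixed subdomain, the ``en-US`` fallback and the header q-values are assumptions, and only the returned keys and the handling of the ``all`` language follow the commit message.

```python
def get_lang_info_sketch(language, supported_any_language):
    """Simplified, hypothetical stand-in for get_lang_info().

    Returns a dict with the keys named in the commit message:
    'params', 'headers', 'subdomain', 'country', 'language'.
    """
    info = {
        "params": {},
        "headers": {},
        "subdomain": "www.google.com",  # assumption: fixed placeholder
        "country": None,
        "language": None,
    }

    if language == "all" and supported_any_language:
        # "all": let Google decide; no ``lr`` and no Accept-Language header.
        info["params"]["source"] = "lnt"
        return info

    if language == "all":
        language = "en-US"  # assumption: fallback for the other engines

    lang, _, country = language.partition("-")
    info["language"] = lang
    info["country"] = country.upper() or None
    info["params"]["lr"] = "lang_" + lang
    if country:
        info["headers"]["Accept-Language"] = f"{language},{lang};q=0.8,*;q=0.5"
    else:
        info["headers"]["Accept-Language"] = f"{lang},*;q=0.5"
    return info


print(get_lang_info_sketch("all", True)["params"])
print(get_lang_info_sketch("de-DE", True)["params"])
```

Returning query parameters and headers separately lets the caller merge them into the request without knowing how the language was resolved.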
2021-04-26 | [pylint] tag PYLINT_FILES by comment `# lint: pylint` | Markus Heiser
These py files are linted by `test.pylint`, all other files are linted by `test.pep8`.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-04-19 | Merge remote-tracking branch 'searx/master' | Alexandre Flament
2021-04-11 | Fix grammar mistake in debug log output | Robin Schneider
2021-04-10 | [enh] replace requests by httpx | Alexandre Flament
2021-02-01 | [mod] dynamically set language_support variable | Alexandre Flament
The language_support variable is set to True by default, and set to False in only 5 engines. Apart from the documentation and the /config URL, this variable is not used. This commit removes the variable definition from the engines and sets the value according to the length of supported_languages: False when the length is 0, True otherwise.

Close #2485
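The rule stated above is small enough to sketch directly. The attribute names `language_support` and `supported_languages` are from the commit message; the helper `set_language_support` and the use of `SimpleNamespace` as a stand-in for an engine module are assumptions.

```python
from types import SimpleNamespace


def set_language_support(engine):
    """Derive ``language_support`` from the length of ``supported_languages``.

    Hypothetical helper: ``engine`` is any object that may carry a
    ``supported_languages`` attribute; a missing attribute counts as empty.
    """
    supported = getattr(engine, "supported_languages", [])
    engine.language_support = len(supported) > 0
    return engine


with_langs = set_language_support(SimpleNamespace(supported_languages=["en", "de"]))
without_langs = set_language_support(SimpleNamespace())
print(with_langs.language_support, without_langs.language_support)
```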
2021-01-28 | [fix] google: avoid unnecessary SearxEngineXPathException errors | Markus Heiser
Avoid SearxEngineXPathException errors when parsing invalid results::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
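One way to avoid an exception like the one in the traceback is to ask for an indexed match with a default value and skip results that yield none. The sketch below illustrates that idea with the standard library only; `getindex`, `extract_urls` and the sample markup are hypothetical stand-ins and are not the searx helpers or the actual fix.

```python
import xml.etree.ElementTree as ET

_NOT_SET = object()  # sentinel: distinguishes "no default" from default=None


def getindex(elements, index, default=_NOT_SET):
    """Return elements[index]; return ``default`` instead of raising if given."""
    if -len(elements) <= index < len(elements):
        return elements[index]
    if default is _NOT_SET:
        raise IndexError(f"index {index} not found")
    return default


# Illustrative markup: the second result block has no link.
SAMPLE = """<div>
<div class="g"><div class="yuRUbf"><a href="https://example.org/">title</a></div></div>
<div class="g"><span>no link here</span></div>
</div>"""


def extract_urls(markup):
    """Collect link targets, skipping result blocks without a link."""
    root = ET.fromstring(markup)
    urls = []
    for result in root.findall("./div[@class='g']"):
        links = result.findall(".//div[@class='yuRUbf']//a")
        link = getindex(links, 0, default=None)
        if link is None:
            continue  # not a valid result: skip it instead of raising
        urls.append(link.get("href"))
    return urls


print(extract_urls(SAMPLE))
```

The sentinel keeps the strict behaviour available: calling `getindex` without a default still raises, so only call sites that expect missing matches opt into the lenient path.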
2021-01-28 | [fix] normalize the language & region aspects of all google engines | Markus Heiser
BTW: make the engines ready for search.checker:

- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 | [fix] revise of the google-news engine | Markus Heiser
This revision is based on the methods developed in the revision of the google engine (see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-14 | [enh] engines: add about variable | Alexandre Flament
Move meta information from comments to the about variable so that the preferences and the documentation can show this information.
2020-12-03 | [mod] various engines: use eval_xpath* functions and searx.exceptions.* | Alexandre Flament
Engine list: ahmia, duckduckgo_images, elasticsearch, google, google_images, google_videos, youtube_api
2020-10-02 | [mod] move extract_text, extract_url to searx.utils | Alexandre Flament
2020-10-01 | [fix] google engine - div classes have been renamed in HTML result | Markus Heiser
Since 1 October 2020 google has changed the 'class' attribute of the HTML result page. Fix the XPath expressions and ignore <div class="g" ../> sections which do not match the title's XPath expression.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-09-22 | fetch google's search langs rather than ui langs | Marc Abonce Seguin
2020-09-10 | Drop Python 2 (1/n): remove unicode string and url_utils | Dalf
2020-07-08 | [fix] pep8 | Adam Tauber
2020-07-07 | [fix] revise google engine | Markus Heiser
this commit is picked from #1985
2019-12-07 | [fix] update xpaths for new google results page | Marc Abonce Seguin
2019-12-02 | Merge pull request #1744 from dalf/optimizations | Adam Tauber
[mod] speed optimization