summaryrefslogtreecommitdiff
path: root/searx/engines/google_videos.py
AgeCommit message (Collapse)Author
2025-09-03[mod] drop: from __future__ import annotationsMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-09-03[mod] addition of various type hints / tbcMarkus Heiser
- pyright configuration [1]_ - stub files: types-lxml [2]_ - addition of various type hints - enable use of new type system features on older Python versions [3]_ - ``.tool-versions`` - set python to lowest version we support (3.10.18) [4]_: Older versions typically lack some typing features found in newer Python versions. Therefore, for local type checking (before commit), it is necessary to use the older Python interpreter. .. [1] https://docs.basedpyright.com/v1.20.0/configuration/config-files/ .. [2] https://pypi.org/project/types-lxml/ .. [3] https://typing-extensions.readthedocs.io/en/latest/# .. [4] https://mise.jdx.dev/configuration.html#tool-versions Signed-off-by: Markus Heiser <markus.heiser@darmarit.de> Format: reST
2025-07-25[fix] google video: refactor broken engine to work againSeriousConcept1134
The current google_videos.py in the master branch is completely non functional, due to it not parsing the returned video search results correctly. The result is searxng saying that no results were found. This commit is a new updated google_videos.py that's designed to fix that and is confirmed to be working. Implementing the suggestions by Bnyro. Re-formatted with `black` for compatibility. After failing automated checks, ran the command: black --line-length 120 --skip-string-normalization --target-version py311 google_videos.py
2025-03-07[doc] add missing docs for the search.max_page settingMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-06[fix] engines: Google-Web & Google-Video (random arc_id)Markus Heiser
Both enghines have been reported ``TooManyRequests``, additionaly Google-Videos thumbnails needed a review. Based on the research from @unixfox [1] this patch generates every hour a new random ``arc_id``. [1] https://github.com/searxng/searxng/issues/4435#issuecomment-2703279522 Closes: - https://github.com/searxng/searxng/issues/4435 - https://github.com/searxng/searxng/issues/4431 Related: - https://github.com/searxng/searxng/discussions/4434 - https://github.com/searxng/searxng/discussions/4429 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-12-22[fix] engine google_video: google changed the layout of the HTML responseMarkus Heiser
Closes: https://github.com/searxng/searxng/issues/4127 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-03add get_embeded_stream_url to searx.utilsAustin-Olacsi
2024-05-16[mod] simple theme: drop img_src from default resultsMarkus Heiser
The use of img_src AND thumbnail in the default results makes no sense (only a thumbnail is needed). In the current state this is rather confusing, because img_src is displayed like a thumbnail (small) and thumbnail is displayed like an image (large). Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-11[mod] pylint all engines without PYLINT_SEARXNG_DISABLE_OPTIONMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-01-16[fix] engine: safesearch parameter in Google Videos engine (#2762)Jinyuan Huang
Closes: https://github.com/searxng/searxng/issues/2762
2023-12-03[mod] add option max_pageMarkus Heiser
Related: https://github.com/searxng/searxng/issues/2982 Closes: https://github.com/searxng/searxng/issues/2972 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-22[fix] engine google_video: google has changed the layout of the rsponseMarkus Heiser
Closes: https://github.com/searxng/searxng/issues/2664 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24[mod] Google: reversed engineered & upgrade to data_type: traits_v1Markus Heiser
Partial reverse engineering of the Google engines including a improved language and region handling based on the engine.traits_v1 data. When ever possible the implementations of the Google engines try to make use of the async REST APIs. The get_lang_info() has been generalized to a get_google_info() function / especially the region handling has been improved by adding the cr parameter. searx/data/engine_traits.json Add data type "traits_v1" generated by the fetch_traits() functions from: - Google (WEB), - Google images, - Google news, - Google scholar and - Google videos and remove data from obsolete data type "supported_languages". A traits.custom type that maps region codes to *supported_domains* is fetched from https://www.google.com/supported_domains searx/autocomplete.py: Reversed engineered autocomplete from Google WEB. Supports Google's languages and subdomains. The old API suggestqueries.google.com/complete has been replaced by the async REST API: https://{subdomain}/complete/search?{args} searx/engines/google.py Reverse engineering and extensive testing .. - fetch_traits(): Fetch languages & regions from Google properties. - always use the async REST API (formally known as 'use_mobile_ui') - use *supported_domains* from traits - improved the result list by fetching './/div[@data-content-feature]' and parsing the type of the various *content features* --> thumbnails are added searx/engines/google_images.py Reverse engineering and extensive testing .. - fetch_traits(): Fetch languages & regions from Google properties. - use *supported_domains* from traits - if exists, freshness_date is added to the result - issue 1864: result list has been improved a lot (due to the new cr parameter) searx/engines/google_news.py Reverse engineering and extensive testing .. - fetch_traits(): Fetch languages & regions from Google properties. *supported_domains* is not needed but a ceid list has been added. - different region handling compared to Google WEB - fixed for various languages & regions (due to the new ceid parameter) / avoid CONSENT page - Google News do no longer support time range - result list has been fixed: XPath of pub_date and pub_origin searx/engines/google_videos.py - fetch_traits(): Fetch languages & regions from Google properties. - use *supported_domains* from traits - add paging support - implement a async request ('asearch': 'arc' & 'async': 'use_ac:true,_fmt:html') - simplified code (thanks to '_fmt:html' request) - issue 1359: fixed xpath of video length data searx/engines/google_scholar.py - fetch_traits(): Fetch languages & regions from Google properties. - use *supported_domains* from traits - request(): include patents & citations - response(): fixed CAPTCHA detection (Scholar has its own CATCHA manager) - hardening XPath to iterate over results - fixed XPath of pub_type (has been change from gs_ct1 to gs_cgt2 class) - issue 1769 fixed: new request implementation is no longer incompatible Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24[mod] Google: fetch engine traits (data_type: supported_languages)Markus Heiser
Implements a fetch_traits function for the Google engines. .. note:: Does not include migration of the request methode from 'supported_languages' to 'traits' (EngineTraits) object! Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27[fix] typos / reported by @kianmeng in searx PR-3366Markus Heiser
[PR-3366] https://github.com/searx/searx/pull/3366 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-01[mod] add 'Accept-Language' HTTP header to online processoresMarkus Heiser
Most engines that support languages (and regions) use the Accept-Language from the WEB browser to build a response that fits to the language (and region). - add new engine option: send_accept_language_header Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-25[fix] google & youtube - set EU consent cookieEmilien Devos
This change the previous bypass method for Google consent using ``ucbcb=1`` (6face215b8) to accept the consent using ``CONSENT=YES+``. The youtube_noapi and google have a similar API, at least for the consent[1]. Get CONSENT cookie from google reguest:: curl -i "https://www.google.com/search?q=time&tbm=isch" \ -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \ | grep -i consent ... location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1 set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure ... PENDING & YES [2]: Google change the way for consent about YouTube cookies agreement in EU countries. Instead of showing a popup in the website, YouTube redirects the user to a new webpage at consent.youtube.com domain ... Fix for this is to put a cookie CONSENT with YES+ value for every YouTube request [1] https://github.com/iv-org/invidious/pull/2207 [2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592 Closes: https://github.com/searxng/searxng/issues/1432
2022-07-09bypass google consent with ucbcb=1Emilien Devos
2022-01-05[enh] add more categoriesMartin Fischer
2021-12-27[format.python] initial formatting of the python codeMarkus Heiser
This patch was generated by black [1]:: make format.python [1] https://github.com/psf/black Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-26[fix] google-videos engine: ignore news articlesMarkus Heiser
In the video search, google also sometimes includes news. E.g. in the DE language when you search for `!gov paris`, google adds an article from a german newspaper (FAZ), I assume these are sponsored link (not tagged advertisement?) Those links do not have an image / this patch ignores *video links* wqithout an image ID. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-26[fix] google video engine - rework of the HTML parserMarkus Heiser
The google video response has been changed slightly, a rework of the parser was needed. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-25[fix] google engine - suggestionMarkus Heiser
BTW: google no longer offers *spelling suggestions* Closes: https://github.com/searxng/searxng/issues/442 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07[pylint] engines: drop no longer needed 'missing-function-docstring'Markus Heiser
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07[fix] drop useless pylint: disable=undefined-variableMarkus Heiser
Since 7b235a1 (see line 591) it is no longer needed to disable 'undefined-variable' for names defined in:: PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06[mod] one logger per engine - drop obsolete logger.getChildMarkus Heiser
Remove the no longer needed `logger = logger.getChild(...)` from engines. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-08-31[pylint] Pylint 2.10 - fix use-list-literal & use-dict-literalMarkus Heiser
Pylint 2.10 added new default checks [1]: use-list-literal Emitted when list() is called with no arguments instead of using [] use-dict-literal Emitted when dict() is called with no arguments instead of using {} [1] https://pylint.pycqa.org/en/latest/whatsnew/2.10.html Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21[docs] add documentation from the sources of the google enginesMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11[fix] log messages from: google- images, news, scholar, videosMarkus Heiser
- HTTP header Accept-Language --> lang_info['headers']['Accept-Language'] - remove obsolete query_url log messages which is already logged by httpx._client:HTTP request Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-10[enh] google engine: supports "default language"Alexandre Flament
Same behaviour behaviour than Whoogle [1]. Only the google engine with the "Default language" choice "(all)"" is changed by this patch. When searching for a locate place, the result are in the expect language, without missing results [2]: > When a language is not specified, the language interpretation is left up to > Google to decide how the search results should be delivered. The query parameters are copied from Whoogle. With the ``all`` language: - add parameter ``source=lnt`` - don't use parameter ``lr`` - don't add a ``Accept-Language`` HTTP header. The new signature of function ``get_lang_info()`` is: lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language) Argument ``supported_any_language`` is True for google.py and False for the other google engines. With this patch the function now returns: - query parameters: ``lang_info['params']`` - HTTP headers: ``lang_info['headers']`` - and as before this patch: - ``lang_info['subdomain']`` - ``lang_info['country']`` - ``lang_info['language']`` [1] https://github.com/benbusby/whoogle-search [2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
2021-04-26[pylint] tag PYLINT_FILES by comment `# lint: pylint`Markus Heiser
These py files are linted by `test.pylint`, all other files are linted by `test.pep8`. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28[fix] normalize the language & region aspects of all google enginesMarkus Heiser
BTW: make the engines ready for search.checker: - replace eval_xpath by eval_xpath_getindex and eval_xpath_list - google_images: remove outer try/except block Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24[fix] google-videos: parse values for 'length' & 'author'Markus Heiser
The 'video.html' template from the 'oscar' design supports replacement for *author* and *length*. Google-videos does not have an author, alternatively the publisher info from is used for the *author*. Hint: these replacements are not supported by the 'simple' design. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24[fix] revise of the google-Video engineMarkus Heiser
This revise is based on the methods developed in the revise of the google engine (see commit 410c2f9). Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-14[enh] engines: add about variableAlexandre Flament
move meta information from comment to the about variable so the preferences, the documentation can show these information
2020-12-03[mod] various engines: use eval_xpath* functions and searx.exceptions.*Alexandre Flament
Engine list: ahmia, duckduckgo_images, elasticsearch, google, google_images, google_videos, youtube_api
2020-11-14[mod] remove unused importAlexandre Flament
use from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url # NOQA so it is possible to easily remove all unused import using autoflake: autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-10-02[mod] move extract_text, extract_url to searx.utilsAlexandre Flament
2020-09-10Drop Python 2 (1/n): remove unicode string and url_utilsDalf
2019-07-31[fix] google_videos engine: some results don't a thumbnailDalf
2019-01-05[mod] google videosVenca24
2019-01-04[FIX] google videos thumbnailsVenca24
2018-11-22[fix] google videos engineVenca24
2017-07-26add google videosmarc