path: root/searx/engines
Age  Commit message  Author
2025-05-03[mod] engines: migration of the individual cache solutions to EngineCacheMarkus Heiser
The EngineCache class replaces all previously individual solutions for caches in the context of the engines.

- demo_offline.py
- duckduckgo.py
- radio_browser.py
- soundcloud.py
- startpage.py
- wolframalpha_api.py
- wolframalpha_noapi.py

Search terms to test most of the modified engines::

    !ddg !rb !sc !sp !wa test
    !ddg !rb !sc !sp !wa foo

For introspection of the DB, jump into the developer environment and run the command to show the cache state::

    $ ./manage pyenv.cmd bash --norc --noprofile
    (py3) python -m searx.enginelib cache state

    cache tables and key/values
    ===========================
    [demo_offline ] 2025-04-22 11:32:50 count --> (int) 4
    [startpage ] 2025-04-22 12:32:30 SC_CODE --> (str) fSOBnhEMlDfE20
    [duckduckgo ] 2025-04-22 12:32:31 4dff493e.... --> (str) 4-128634958369380006627592672385352473325
    [duckduckgo ] 2025-04-22 12:40:06 3e2583e2.... --> (str) 4-263126175288871260472289814259666848451
    [radio_browser ] 2025-04-23 11:33:08 servers --> (list) ['https://de2.api.radio-browser.info', ...]
    [soundcloud ] 2025-04-29 11:40:06 guest_client_id --> (str) EjkRJG0BLNEZquRiPZYdNtJdyGtTuHdp
    [wolframalpha ] 2025-04-22 12:40:06 code --> (str) 5aa79f86205ad26188e0e26e28fb7ae7

    number of tables: 6
    number of key/value pairs: 7

In the "cache tables and key/values" section, the table name (engine name) is in the first position, the calculated expiry date is in the second, and the key and value are shown in the third and fourth positions.

About duckduckgo: the *vqd code* of ddg depends on the query term, so the key is a hash value of the query term (so that the raw query term is not stored).

In the "properties of ENGINES_CACHE" section, all properties of the SQLiteAppl / ExpireCache and their last modification dates are shown::

    properties of ENGINES_CACHE
    ===========================
    [last modified: 2025-04-22 11:32:27] DB_SCHEMA : 1
    [last modified: 2025-04-22 11:32:27] LAST_MAINTENANCE :
    [last modified: 2025-04-22 11:32:27] crypt_hash : ca612e3566fdfd7cf7efe2b1c9349f461158d07cb78a3750e5c5be686aa8ebdc
    [last modified: 2025-04-22 11:32:30] CACHE-TABLE--demo_offline: demo_offline
    [last modified: 2025-04-22 11:32:30] CACHE-TABLE--startpage: startpage
    [last modified: 2025-04-22 11:32:31] CACHE-TABLE--duckduckgo: duckduckgo
    [last modified: 2025-04-22 11:33:08] CACHE-TABLE--radio_browser: radio_browser
    [last modified: 2025-04-22 11:40:06] CACHE-TABLE--soundcloud: soundcloud
    [last modified: 2025-04-22 11:40:06] CACHE-TABLE--wolframalpha: wolframalpha

These properties provide information about the state of the ExpireCache and control its behavior. For example, the maintenance intervals are controlled by the last modification date of the LAST_MAINTENANCE property, and the hash value of the password can be used to detect whether the password has been changed (in this case the DB entries can no longer be decrypted and the entire cache must be discarded).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
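The idea of keying a cache entry by a hash of the query term, with an expiry date per entry, can be sketched roughly as follows. This is only an illustration under assumptions: ``cache_key``, ``TinyExpireStore`` and the use of HMAC-SHA256 are hypothetical and are not taken from the EngineCache implementation.

```python
import hashlib
import hmac
import time


def cache_key(query, secret):
    """Derive a cache key from the query so the raw query is never stored.

    Hypothetical helper: a keyed hash (HMAC-SHA256) is assumed here.
    """
    return hmac.new(secret.encode("utf-8"), query.encode("utf-8"), hashlib.sha256).hexdigest()


class TinyExpireStore:
    """In-memory stand-in for a key/value cache with per-entry expiry."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, expire, now=None):
        # Store the value together with the absolute time at which it expires.
        now = time.time() if now is None else now
        self._data[key] = (value, now + expire)

    def get(self, key, default=None, now=None):
        now = time.time() if now is None else now
        value, expires_at = self._data.get(key, (default, None))
        if expires_at is None or expires_at <= now:
            # Unknown or expired entries behave as cache misses.
            self._data.pop(key, None)
            return default
        return value
```

A caller would store a per-query token with ``store.set(cache_key(query, secret), token, expire=3600)`` and read it back with ``store.get(cache_key(query, secret))``; after the expiry time the lookup falls back to the default.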
2025-05-02[fix] semantic scholar: method not allowed / engine doesn't workBnyro
Fixes the semantic scholar engine by extracting a UI version token. BTW: remove HTML tags from the content.

Author's checklist:

- they rate-limit very quickly: if you send more than about 2 requests per minute, you have to wait some time before trying again
- they also have an official API at api.semanticscholar.org, but its rate limits are even stricter

Closes: https://github.com/searxng/searxng/issues/4685
2025-05-02[feat] engine ChinaSo: support source filter for ChinaSo-NewsBrandonStudio
* filtering ChinaSo-News results by source, option ``chinaso_news_source``
* add ChinaSo engine to the online docs https://docs.searxng.org/dev/engines/online/chinaso.html
* fix SearXNG categories in the settings.yml
* deactivate ChinaSo engines ``inactive: true`` until [1] is fixed
* configure network of the ChinaSo engines

[1] https://github.com/searxng/searxng/issues/4694

Signed-off-by: @BrandonStudio
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2025-04-30[fix] brave: fix images and videos enginesBnyro
2025-04-29[fix] pdia: dynamically fetch API key config file locationDenperidge
As suggested by @Bnyro at https://github.com/searxng/searxng/pull/4652#discussion_r2055760390 !
2025-04-29[fix] engine yahoo: replace fetch_traits by a list of languagesMarkus Heiser
The Yahoo engine's fetch_traits function has been encountering an error in CI jobs for several months [1], thus aborting the process for all other engines as well.

The language selection dialog (which fetch_traits calls) requires an `EuConsent` cookie. Strangely, the cookie is not needed for searching, which is why the engine itself still works.

Since Yahoo won't be conquering any new marketplaces in the foreseeable future, it should be sufficient to hard-code the list of currently available languages (`yahoo_languages`).

[1] https://github.com/searxng/searxng/actions/runs/14720458830/job/41313149268

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-04-24[fix] fix Quark engine callingZhijie He
2025-04-23[fix] typo in soundcloud engineMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-04-23[fix] engine: re-implement mullvad leta integrationGrant Lanham
Re-writes the Mullvad Leta integration to work with the new breaking changes. Mullvad Leta is a search engine proxy. Currently Leta only offers text search results, not image, news or any other types of search result. Leta acts as a proxy to Google and Brave search results.

- Remove docstring comments regarding requiring the use of Mullvad VPN, which is no longer a hard requirement.
- Configured two engines: ``mullvadleta`` (uses google) and ``mullvadleta brave`` (uses brave).
- Since Leta may not provide up-to-date search results, both search engines are disabled by default.

.. hint::

   Leta caches each search for up to 30 days. For example, if you use search terms like ``news``, contrary to your intention you'll get very old results!

Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
Signed-off-by: Grant Lanham <contact@grantlanham.com>
2025-04-18[feat] engine: add Steam engineZhijie He
2025-04-17[feat] engines: add Hugging Face engineZhijie He
2025-04-17[feat] engine: add engine for italian press agency ansaTommaso Colella
2025-04-17[feat] add SensCritique (FR) engineRobinFrcd
Closes: https://github.com/searxng/searxng/issues/4623
2025-04-12[feat] engine: add microsoft learn engineTommaso Colella
2025-04-11[fix] engine dokuwiki: basedir duplicationgrasdk
Dokuwiki searches behind a reverse proxy had a duplicate base path in the URL, creating a wrong URL. This patch replaces string concatenation of URLs with urljoin [1] from urllib.parse. This eliminates the duplication problem, while retaining the old functionality designed to concatenate protocol, hostname and port (as base_url) with the path.

[1] https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urljoin

Closes: https://github.com/searxng/searxng/issues/4598
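The difference between string concatenation and ``urljoin`` can be seen in a small sketch. The host names and paths below are made-up examples, and ``result_url`` is a hypothetical helper, not the function used in the engine.

```python
from urllib.parse import urljoin


def result_url(base_url, path):
    """Combine the configured base URL with a path returned by the wiki."""
    return urljoin(base_url, path)


# Behind a reverse proxy the base URL already carries the "/wiki" prefix,
# and the path returned by the wiki carries it too.
naive = "https://example.org/wiki" + "/wiki/doku.php?id=start"
# naive == "https://example.org/wiki/wiki/doku.php?id=start"  (prefix duplicated)

joined = result_url("https://example.org/wiki/", "/wiki/doku.php?id=start")
# joined == "https://example.org/wiki/doku.php?id=start"

# The plain case (protocol, hostname and port as base_url) still works.
plain = result_url("http://localhost:8090", "/doku.php?id=start")
# plain == "http://localhost:8090/doku.php?id=start"
```

Because the path starts with ``/``, ``urljoin`` treats it as absolute and replaces the path of the base URL instead of appending to it.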
2025-04-09[fix] make docs - ERROR: Unknown target name: "auth_key"Markus Heiser
BTW: fix a bug with sys.path: repo-root (not util) needs to be added to generate autodoc from scripts in ./searxng_extra

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-04-07[fix] Meilisearch engine: Authorization Token When Integrating MeilisearchMarkus Heiser
`X-Meili-API-Key` has been changed to `Authorization` [1] [1] https://www.meilisearch.com/docs/reference/api/overview#authorization Suggested-by: https://github.com/searxng/searxng/issues/4416#issuecomment-2781254841 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
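As a rough sketch of the header change, assuming the bearer-token scheme described in the Meilisearch documentation linked above: ``build_headers`` and ``auth_key`` are illustrative names, not necessarily those used in the engine.

```python
def build_headers(auth_key=None):
    """Build request headers for a Meilisearch query.

    Hypothetical helper: sends the key as a bearer token in the
    ``Authorization`` header instead of the old ``X-Meili-API-Key`` header.
    """
    headers = {"Content-Type": "application/json"}
    if auth_key:
        headers["Authorization"] = f"Bearer {auth_key}"
    return headers
```

When no key is configured, the ``Authorization`` header is simply omitted.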
2025-04-06[fix] engine radio browser: get servers from DNS api.radio-browser.infoMarkus Heiser
Do a DNS lookup of 'all.api.radio-browser.info', add a reverse lookup, and randomly select a URL from the available servers.

Closes: https://github.com/searxng/searxng/issues/4576

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
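The lookup steps described above could look roughly like this with the standard ``socket`` module. This is a sketch under assumptions: ``resolve_servers`` and ``pick_server`` are hypothetical names, the resolver functions are passed in as parameters so they can be replaced in tests, and the actual engine may be implemented differently.

```python
import random
import socket


def resolve_servers(hostname, getaddrinfo=socket.getaddrinfo, gethostbyaddr=socket.gethostbyaddr):
    """Return sorted https:// URLs of the hosts behind ``hostname``."""
    # Forward lookup: collect the distinct IP addresses of the round-robin name.
    ips = {info[4][0] for info in getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)}
    servers = set()
    for ip in ips:
        try:
            # Reverse lookup: map each IP back to its canonical host name.
            name = gethostbyaddr(ip)[0]
        except OSError:
            # Skip addresses without a usable reverse record.
            continue
        servers.add(f"https://{name}")
    return sorted(servers)


def pick_server(servers, rng=random):
    """Randomly select one of the available server URLs."""
    return rng.choice(servers)
```

A call such as ``pick_server(resolve_servers("all.api.radio-browser.info"))`` would then yield one server URL per invocation; caching the resolved list avoids repeating the DNS queries on every request.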
2025-04-01[fix] hardening against arguments of type None, where str or dict is expectedMarkus Heiser
On a long-running server, the tracebacks below can be found (albeit rarely), which indicate problems with NoneType where a string or another data type is expected.

result.img_src::

    File "/usr/local/searxng/searxng-src/searx/templates/simple/result_templates/images.html", line 13, in top-level template code
      <img src="" data-src="{{ image_proxify(result.img_src) }}" alt="{{ result.title|striptags }}">{{- "" -}}
      ^
    File "/usr/local/searxng/searxng-src/searx/webapp.py", line 284, in image_proxify
      if url.startswith('//'):
         ^^^^^^^^^^^^^^
    AttributeError: 'NoneType' object has no attribute 'startswith'

result.content::

    File "/usr/local/searxng/searxng-src/searx/result_types/_base.py", line 105, in _normalize_text_fields
      result.content = WHITESPACE_REGEX.sub(" ", result.content).strip()
                       ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
    TypeError: expected string or bytes-like object, got 'NoneType'

html_to_text, when html_str is a NoneType::

    File "/usr/local/searxng/searxng-src/searx/engines/wikipedia.py", line 190, in response
      title = utils.html_to_text(api_result.get('titles', {}).get('display') or api_result.get('title'))
    File "/usr/local/searxng/searxng-src/searx/utils.py", line 158, in html_to_text
      html_str = html_str.replace('\n', ' ').replace('\r', ' ')
                 ^^^^^^^^^^^^^^^^
    AttributeError: 'NoneType' object has no attribute 'replace'

presearch engine, when json_resp is a NoneType::

    File "/usr/local/searxng/searxng-src/searx/engines/presearch.py", line 221, in response
      results = parse_search_query(json_resp.get('results'))
    File "/usr/local/searxng/searxng-src/searx/engines/presearch.py", line 161, in parse_search_query
      for item in json_results.get('specialSections', {}).get('topStoriesCompact', {}).get('data', []):
                  ^^^^^^^^^^^^^^^^
    AttributeError: 'NoneType' object has no attribute 'get'

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
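Guards against ``None`` of the kind this commit describes might look like the sketch below. The function names echo those in the tracebacks, but the bodies are simplified illustrations and assumptions, not the code of the actual patch.

```python
import re

WHITESPACE_REGEX = re.compile(r"\s+")


def normalize_text(value):
    """Collapse whitespace; treat None (or any falsy value) as an empty string."""
    if not value:
        return ""
    return WHITESPACE_REGEX.sub(" ", str(value)).strip()


def image_proxify(url):
    """Return a usable image URL, or None when no URL was provided."""
    if not url:
        return None
    if url.startswith("//"):
        # Protocol-relative URL: assume https for this sketch.
        url = "https:" + url
    return url


def top_stories(json_results):
    """Walk a nested JSON structure without failing on None at any level."""
    json_results = json_results or {}
    special = json_results.get("specialSections") or {}
    stories = special.get("topStoriesCompact") or {}
    return stories.get("data") or []
```

Using ``x or {}`` instead of ``.get(key, {})`` matters here: the default of ``dict.get`` only applies when the key is missing, not when the key is present with a ``None`` value.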
2025-03-30[feat] engines: add Ollama engineZhijie He
2025-03-30[feat] engines: add reuters news engineBnyro
2025-03-30[feat] engine: add engine for italian online newspaper "il post"Tommaso Colella
2025-03-30[feat] engines: add Quark engineZhijie He
Co-authored-by: Bnyro <bnyro@tutanota.com>
2025-03-30[feat] engines: add Niconico videos engineZhijie He
Co-authored-by: Bnyro <bnyro@tutanota.com>
2025-03-30[feat] engine: add bitchutenaughtymommy42069
2025-03-28[fix] presearch engine: Unexpected crash if duration not in videosAadniz
2025-03-27[fix] make docs -> ERROR: Unknown target name: "google: max 50 pages".Markus Heiser
Fix the issues reported by sphinx build::

    docstring of searx.engines.google.max_page:1: ERROR: Unknown target name: "google: max 50 pages".
    docstring of searx.engines.google_images.max_page:1: ERROR: Unknown target name: "google: max 50 pages".
    docstring of searx.engines.google_scholar.max_page:1: ERROR: Unknown target name: "google: max 50 pages".

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-27[fix] baidu engine: properly decoding HTML escape codesAadniz
2025-03-25[refactor] duration strings: move parsing logic to utils.pyBnyro
2025-03-25[fix] duckduckgo news: unescaped html sequences in descriptionBnyro
2025-03-21[fix] typo in doc-str: offical -> officialIkko Eltociear Ashimine
2025-03-21[fix] duckduckgo: answer sometimes contains faulty (duplicated) urlBnyro
2025-03-20[fix] presearch videos: item description and duration are located in metadata fieldBnyro
2025-03-20[fix] presearch engine: News and Videos formatted incorrectlyAadniz
2025-03-19[fix] engine: core.ac.uk implement API v3 / v2 is no longer supportedTan Yong Sheng
2025-03-18[fix] duckduckgo: show proper source url of answersBnyro
2025-03-17[feat] engine: add selfh.st/icons for logos of common self-hosted programsBnyro
2025-03-16[engine] elasticsearch: add pagination supportBnyro
2025-03-15fixup! [fix] fix invalid escape error in Baidu Images & default config typoMarkus Heiser
2025-03-15[fix] fix invalid escape error in Baidu Images & default config typoZhijie He
2025-03-15[feat]: engines add images & kaifa from baidu.comZhijie He
2025-03-15[mod] migrate all key-value.html templates to KeyValue typeMarkus Heiser
The engines now all use KeyValue results and return the results in an EngineResults object. The sqlite engine can return MainResult results in addition to KeyValue results (based on the engine's config in settings.yml).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-08[fix] presearch engine: domain sometimes included in beginning of titlesAadniz
2025-03-08[feat] add bilibili support to get_embeded_stream_urlAustin-Olacsi
2025-03-07[fix] presearch engine: Title showing <em> html codeAadniz
2025-03-07[fix] set language for engines from chinese market (no i18n index nor UI)Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-07[fix] engine qwant: add tgp and llm arguments to avoid CAPTCHALoris
2025-03-07[doc] add missing docs for the search.max_page settingMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-07[feat] engines: add baidu (general)Bubu
2025-03-06[feat] engines: add www.acfun.cnZhijie He