| Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
Add Apache Solr engine
|
|
[fix] remove unused import from yahoo-news engine
|
|
Signed-off-by: Markus Heiser <markus@darmarit.de>
|
|
- https://www.acgsou.com/ acgsou.com is redirected to 36dm.club
- @rinpatch do not plan on maintaining the engine [1]
[1] https://github.com/searx/searx/pull/1283#issuecomment-798783585
Signed-off-by: Markus Heiser <markus@darmarit.de>
|
|
|
|
Add Solid Torrents engine
|
|
[mod] by default allow only HTTPS, not HTTP
|
|
BTW: make the code slightly more readable
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Many things have been changed since last review of this engine. This patch fix
xpath selectors, implements suggestion and is a complete review / rewrite of the
engine.
Signed-off-by: Markus Heiser <markus@darmarit.de>
|
|
Related to https://github.com/searx/searx/pull/2373
|
|
|
|
|
|
When initing engines a "SearxEngineResponseException" is logged very verbose,
including full traceback information:
ERROR:searx.engines:yggtorrent engine: Fail to initialize
Traceback (most recent call last):
File "share/searx/searx/engines/__init__.py", line 293, in engine_init
init_fn(get_engine_from_settings(engine_name))
File "share/searx/searx/engines/yggtorrent.py", line 42, in init
resp = http_get(url, allow_redirects=False)
File "share/searx/searx/poolrequests.py", line 197, in get
return request('get', url, **kwargs)
File "share/searx/searx/poolrequests.py", line 190, in request
raise_for_httperror(response)
File "share/searx/searx/raise_for_httperror.py", line 60, in raise_for_httperror
raise_for_captcha(resp)
File "share/searx/searx/raise_for_httperror.py", line 43, in raise_for_captcha
raise_for_cloudflare_captcha(resp)
File "share/searx/searx/raise_for_httperror.py", line 30, in raise_for_cloudflare_captcha
raise SearxEngineCaptchaException(message='Cloudflare CAPTCHA', suspended_time=3600 * 24 * 15)
searx.exceptions.SearxEngineCaptchaException: Cloudflare CAPTCHA, suspended_time=1296000
For SearxEngineResponseException this is not needed. Those types of exceptions
can be a normal use case. E.g. for CAPTCHA errors like shown in the example
above. It should be enough to log a warning for such issues:
WARNING:searx.engines:yggtorrent engine: Fail to initialize // Cloudflare CAPTCHA, suspended_time=1296000
closes: #2612
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
The old xpath configuration for google scholar did not work and is replaced by a
python implementation.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Fix fetch_languages for Bing
|
|
Add freesound engine with player.
Co-authored-by: Gazoil <maildeguzel@gmail.com>
|
|
|
|
Bing has a list of regions that it supports and some of these regions
may have more than one possible language.
In some cases, like Switzerland, these languages are always shown as
options, so there is no issue. But in other cases, like Andorra, Bing
will only show one language at the time, either the region's default or
the request's language if the latter is supported by that region.
For example, if the HTTP request is in French, Andorra will appear as
fr-AD but if the same page is requested in any other language Andorra
will appear as ca-AD.
This is specially a problem when Bing assumes that the request is in
English because it overrides enough language codes to make several major
languages like Arabic dissappear from the languages.py file.
To avoid that issue, I set the Accept-Language header to a language
that's only supported in one region to hopefully avoid these overrides.
|
|
|
|
Update rumble.py
some lines too long.
Disable Rumble engine
disabled : True
PEP8 fix
change line spacing
|
|
update yggtorrent url + add it back
|
|
|
|
At the moment videos without a description are not shown - setting
default content to "" fixes this.
Another current bug is that thumbnails are not displayed. This is caused
by a double slash in the url. For this every trailing slash is now
stripped (for backwards compatibility) and the API response is correctly
parsed.
|
|
[remove] yandex engine
|
|
* searx understand "!ddg !g time" as : send "!g time" to DDG
* !g a DDG bang for Google: DDG return a HTTP redirect to Google
This commit adds a the allows_redirect param not to follow HTTP redirect.
The DDG engine returns a empty result as before without HTTP redirect.
|
|
[mod] json_engine: add content_html_to_text and title_html_to_text
|
|
[upd] wikipedia engine: return an empty result on query with illegal characters
|
|
[fix] fix seznam engine
|
|
Fix duckduckgo
|
|
[enh] add engine MediathekViewWeb (API)
|
|
|
|
no paging support
|
|
on some queries (like an IT error message), wikipedia returns an HTTP error 400.
this commit returns an empty result instead of showing an error to the user.
|
|
Some JSON API returns HTML in either in the HTML or the content.
This commit adds two new parameters to the json_engine:
content_html_to_text and title_html_to_text, False by default.
If True, then the searx.utils.html_to_text removes the HTML tags.
Update crossref, openairedatasets and openairepublications engines
|
|
[Engine] Add Library of Congress engine
|
|
After the main request, send a second to https://duckduckgo.com/t/sl_h
See https://github.com/searx/searx/issues/2259
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
|
|
|
|
|
|
The language_support variable is set to True by default,
and set to False in only 5 engines.
Except the documentation and the /config URL, this variable is not used.
This commit remove the variable definition in the engines, and
set value according to supported_languages length: False when the length is 0,
True otherwise.
Close #2485
|
|
Avoid SearxEngineXPathException errors when parsing non valid results::
.//div[@class="yuRUbf"]//a/@href index 0 not found
Traceback (most recent call last):
File "./searx/engines/google.py", line 274, in response
url = eval_xpath_getindex(result, href_xpath, 0)
File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
BTW: make the engines ready for search.checker:
- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
The 'video.html' template from the 'oscar' design supports replacement
for *author* and *length*. Google-videos does not have an author, alternatively
the publisher info from is used for the *author*.
Hint: these replacements are not supported by the 'simple' design.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
This revise is based on the methods developed in the revise of the google engine
(see commit 410c2f9).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|