| Age | Commit message (Collapse) | Author |
|
Block requests from PetalBlock. Normally robots.txt is enough to stop
PetalBlock from making requests [1]. However, if SearXNG is offered below a
path (example.org/search), then the robots.txt is not available in the root
paths of the domain / subdomain.
[1] https://webmaster.petalsearch.com/site/petalbot
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Since [bb3a01f8] has been merged to the Farside project, Farside instances do no
longer need to send requests to SearXNG instances [1].
There are some old unmaintained Farside instances on the web that continue to
query SearXNG instances --> we can safely block their requests.
[1] https://github.com/benbusby/farside/issues/95
[bb3a01f8] https://github.com/benbusby/farside/commit/bb3a01f8
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Related: https://github.com/searxng/searxng/issues/2310#issuecomment-1494417531
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
- requests without HTTP header 'Connection' or missing 'User-Agent' will be
blocked by the limiter
- re_bot is related to 'User-Agent' and has been renamed to block_user_agent
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
In debug mode more detailed logging is needed to evaluate if an access should
have been blocked by the limiter.
BTW: remove duplicate code checking bot signature ``re_bot.match(user_agent)``
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
When the user choose "Auto-detected", the choice remains on the following queries.
The detected language is displayed.
For example "Auto-detected (en)":
* the next query language is going to be auto detected
* for the current query, the detected language is English.
This replace the autodetect_search_language plugin.
|
|
Related: https://github.com/searxng/searxng/pull/2189
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
|
|
[fix] follow up of PR-1856
|
|
- Add documentation to the plugin
- Harmonize FastText language model with SearXNG's language model
Reosurces::
import fasttext # --> +10 MB
fasttext.load_model(str(data_dir / 'lid.176.ftz')) # --> +4MB
Suggested-by: @dalf
- To speed up and simplify the deployment use fasttext-wheel instead of fasttext
- Building numpy on the Alpine Linux of docker-images takes ages --> install
py3-numpy from Alpines package manager (apk)
- Alpine Linux on docker-images (musl libc) do not support fasttext-wheel (gnu
libc) --> patch Dockerfile and build from fastetxt:
sed -i s/fasttext-wheel/fasttext/ requirements.txt
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
Remove the abstraction in searx.shared.SharedDict.
Implement a basic and dedicated scheduler for the checker using a Redis script.
|
|
[PR-3366] https://github.com/searx/searx/pull/3366
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Currentty, when oa_doi_rewrite find a DOI in the result URL, it replace the URL.
In this commit, the plugin adds the key "doi" to the result,
so the paper.html can show it.
|
|
Only raise "suspicious Accept-Encoding" when both "gzip" and "deflate" are missing from Accept-Encoding.
Prevent Browsers which only implement one compression solution from being blocked by the limiter plugin.
Example Browser which is currently blocked: Lynx Browser (https://lynx.invisible-island.net)
|
|
|
|
|
|
Add a redis library to generalize DB functions we need in SearXNG.
|
|
Remove issue reported by Pylint 2.14.0:
- no-self-use: has been moved to optional extension [1]
- The refactoring checker now also raises 'consider-using-generator' messages
for max(), min() and sum(). [2]
.pylintrc:
- <option name>-hint has been removed since long, Pylint 2.14.0 raises an
error on invalid options
- bad-continuation and bad-whitespace have been removed [3]
[1] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/summary.html#removed-checkers
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/full.html#what-s-new-in-pylint-2-14-0
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.6/summary.html#summary-release-highlights
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
Requested-by: https://github.com/searxng/searxng/discussions/993#discussioncomment-2396914
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
* oscar theme: code from searx/plugins/infinite_scroll.py
* simple theme: new implementation
Co-authored-by: Markus Heiser <markus.heiser@darmarIT.de>
|
|
[limiter] update
|
|
Rename result field data_src to iframe_src
Suggested-by: @dalf https://github.com/searxng/searxng/pull/882#issuecomment-1037997402
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
This is a rewrite of the hostname_replace.py that:
- don't stop to replace URL in fields ('data_src', 'audio_src') if there isn't a
'parsed_url',
- adds a comment about keep or remove a result from the result list
- adds a loop over ['data_src', 'audio_src'] instead of doubling code lines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Embedded HTML breaks SearXNG architecture. To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src' (and 'audio_src'), an URL for embedded content (<iframe>).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
To test you need to redirect embeded videos (e.g.) from youtube to a invidios
instance. Search for videos using engine `!youtube lebowski`. The result URLs
and the embeded videos should link to the invidios instance.
Here is an example of such a `hostname_replace` configuration::
hostname_replace:
# youtube --> Invidious
'(.*\.)?youtube-nocookie\.com': 'invidio.xamh.de'
'(.*\.)?youtube\.com$': 'invidio.xamh.de'
'(.*\.)?invidious\.snopyta\.org$': 'invidio.xamh.de'
'(.*\.)?vid\.puffyan\.us': 'invidio.xamh.de'
'(.*\.)?invidious\.kavin\.rocks$': 'invidio.xamh.de'
'(.*\.)?inv\.riverside\.rocks$': 'invidio.xamh.de'
Closes: https://github.com/searxng/searxng/issues/873
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
also adjust the number of req/time
|
|
can replace filtron:
* rate limite the number of request per IP and per (IP, User-Agent)
* block some bots
use Redis
data stored in Redis never contains the IP addresses, only HMAC using the secret_key
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Previously the Setting classes used a horrible _post_init
hack that prevented proper type checking.
|
|
This patch was generated by black [1]::
make format.python
[1] https://github.com/psf/black
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Disable the python code formatting from python-black, where the readability of
code suffers by formatting.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
add a new function "init" call when the app starts.
The function can:
* return False to disable the plugin.
* modify the Flask app.
|
|
* backport of https://github.com/searx/searx/pull/2724
* allow to remove result if the replacement is the boolean value false
|
|
see commit 6c9ae7911e9639bc46cd53af215734b4bdb61ba9
|
|
doi_resolvers.get_value('preferences') already contains the value from
request.args.get('doi_resolver')
|
|
required attributes: display a different message
when the attribute has the wrong type
|
|
First details about the general tab, then detail about UI tab, etc...
No functionnal change
|
|
close #115
|
|
|
|
The usefulness of the _HTTPS rewrite_ plugin is questionable:
- the 36 rule files have not been updated since 2015 [1]
- actual there are 23760 rule files in the https-everywhere repo [2]
For the first, we can remove this plugin. For a complete new implementation, it
might be good to know that there is a project "https-everywhere : Privacy for
Pythons" [3]
related: https://github.com/return42/searx-next/issues/8
[1] https://github.com/return42/searx-next/tree/d187a1d/searx/plugins/https_rules
[2] https://github.com/EFForg/https-everywhere/tree/master/src/chrome/content/rules
[3] https://github.com/jayvdb/https-everywhere-py
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|