| Age | Commit message (Collapse) | Author |
|
Implementations of the *traits* of the engines.
Engine's traits are fetched from the origin engine and stored in a JSON file in
the *data folder*. Most often traits are languages and region codes and their
mapping from SearXNG's representation to the representation in the origin search
engine.
To load traits from the persistence::
searx.enginelib.traits.EngineTraitsMap.from_data()
For new traits new properties can be added to the class::
searx.enginelib.traits.EngineTraits
.. hint::
Implementation is downward compatible to the deprecated *supported_languages
method* from the vintage implementation.
The vintage code is tagged as *deprecated* an can be removed when all engines
has been ported to the *traits method*.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
The request function should not request a language (aka locale) that is not
supported by qwant. Select a locale like zh-TW ends in qwant's API error:
ERROR searx.engines.qwant news: exception : \
API error::locale must be one of the following values: \
en_gb, en_ie, en_us, en_ca, en_my, en_au, en_nz, de_de, de_ch, de_at, fr_fr, \
fr_be, fr_ch, fr_ca, fr_ad, fc_ca, co_fr, es_es, es_ar, es_cl, es_co, es_mx, \
es_pe, es_ad, ca_es, ca_ad, ca_fr, eu_es, eu_fr, it_it, it_ch, pt_pt, pt_ad, \
nl_be, nl_nl
The existing searx.utils.match_language function is unsuitable for this purpose,
it is replaced by function searx.locales.get_engine_locale that is based on the
methods from the babel package.
The quant's _fetch_supported_languages function has been revised to filter out
languages 8aka locales) not supported by qwant.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
In PR #1071 the language catalog of dailymotion has been cleaned up, before
there had been over 7000 "languages" in the catalog.
As a side effect of this clean-up the language & region catalog in SearXNG has
been reduced [1].
This patch reduce the ``min_engines_per_lang`` from 13 to 12 to get the missed
languages back in language & region catalog of SearXNG.
[1] https://github.com/searxng/searxng/pull/1071/commits/3bb62823ec3af0e67bd2d959bec20c4791ee3bac#diff-f3f00db0f87f95b882624a192e0aac21525638af0b18c9514e765fcf1991678d
Requested-by: @tiekoetter in a Matrix chat
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
- fix the issue of fetching more the 7000 *languages*
- improve the request function and filter by language & country
- implement time_range_support & safesearch
- add more fields to the response from dailymotion (allow_embed, length)
- better clean up of HTML tags in the 'content' field.
This is more or less a complete rework based on the '/videos' API from [1].
This patch cleans up the language list in SearXNG that has been polluted by the
ISO-639-3 2 and 3 letter codes from dailymotion languages which have never been
used.
[1] https://developers.dailymotion.com/tools/
Closes: https://github.com/searxng/searxng/issues/1065
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Languages are supported by mapping the language to a domain. If domain is not
found in :py:obj:`lang2domain` URL ``<lang>.search.yahoo.com`` is used.
BTW: fix issue reported at https://github.com/searx/searx/issues/3020
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
The implementation uses the Qwant API (https://api.qwant.com/v3). The API is
undocumented but can be reverse engineered by reading the network log of
https://www.qwant.com/ queries.
This implementation is used by different qwant engines in the settings.yml::
- name: qwant
categories: general
...
- name: qwant news
categories: news
...
- name: qwant images
categories: images
...
- name: qwant videos
categories: videos
...
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
|
|
|
|
The old xpath configuration for google scholar did not work and is replaced by a
python implementation.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
|
|
|
|
build by Makefile target:
make project
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
To get meaningfull diffs, the json file has to be sorted. Before applying any
further content patch, the json file needs a inital sort (without changing any
content).
Sorted by::
import sys, json
with open('engines_languages.json') as f:
j = json.load(f)
with open('engines_languages.json', 'w') as f:
json.dump(j, f, indent=2, sort_keys=True)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Instead of a single line with 500000 characters use nicely formatted JSON.
Sort the lists in engine_languages.py so when updating it is possible to
more easily see the differences (search engines do change the order their
languages are listed in)
|
|
Add match_language function in utils to match any user given
language code with a list of engine's supported languages.
Also add language_aliases dict on each engine to translate
standard language codes into the custom codes used by the engine.
|
|
Add languages supported by either all default general engines or 10 engines.
|
|
Also, fix fetch_languages.py so it can run on python3.
|
|
|
|
languages.py can change, so users may query on a language that is not
on the list anymore, even if it is still recognized by a few engines.
also made no and nb the same because they seem to return the same,
though most engines will only support one or the other.
|
|
closes issue #863
|
|
that support them.
users can still query lesser supported through the :lang_code bang.
|
|
|
|
and refactor method to make it testable without making requests
|
|
utils/fetch_languages.py gets languages supported by each engine and
generates engines_languages.json with each engine's supported language.
|