summaryrefslogtreecommitdiff
path: root/tests/unit/test_utils.py
AgeCommit message (Collapse)Author
2025-08-18[fix] revision of utils.HTMLTextExtractor (#5125)Markus Heiser
Related: - https://github.com/searxng/searxng/pull/5073#issuecomment-3196282632
2025-07-26[fix] cleanup: rename `searx` leftovers to `SearXNG` (#5049)Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-01-28[refactor] typification of SearXNG (initial) / result items (part 1)Markus Heiser
Typification of SearXNG ======================= This patch introduces the typing of the results. The why and how is described in the documentation, please generate the documentation .. $ make docs.clean docs.live and read the following articles in the "Developer documentation": - result types --> http://0.0.0.0:8000/dev/result_types/index.html The result types are available from the `searx.result_types` module. The following have been implemented so far: - base result type: `searx.result_type.Result` --> http://0.0.0.0:8000/dev/result_types/base_result.html - answer results --> http://0.0.0.0:8000/dev/result_types/answer.html including the type for translations (inspired by #3925). For all other types (which still need to be set up in subsequent PRs), template documentation has been created for the transition period. Doc of the fields used in Templates =================================== The template documentation is the basis for the typing and is the first complete documentation of the results (needed for engine development). It is the "working paper" (the plan) with which further typifications can be implemented in subsequent PRs. - https://github.com/searxng/searxng/issues/357 Answer Templates ================ With the new (sub) types for `Answer`, the templates for the answers have also been revised, `Translation` are now displayed with collapsible entries (inspired by #3925). !en-de dog Plugins & Answerer ================== The implementation for `Plugin` and `Answer` has been revised, see documentation: - Plugin: http://0.0.0.0:8000/dev/plugins/index.html - Answerer: http://0.0.0.0:8000/dev/answerers/index.html With `AnswerStorage` and `AnswerStorage` to manage those items (in follow up PRs, `ArticleStorage`, `InfoStorage` and .. will be implemented) Autocomplete ============ The autocompletion had a bug where the results from `Answer` had not been shown in the past. To test activate autocompletion and try search terms for which we have answerers - statistics: type `min 1 2 3` .. in the completion list you should find an entry like `[de] min(1, 2, 3) = 1` - random: type `random uuid` .. in the completion list, the first item is a random UUID Extended Types ============== SearXNG extends e.g. the request and response types of flask and httpx, a module has been set up for type extensions: - Extended Types --> http://0.0.0.0:8000/dev/extended_types.html Unit-Tests ========== The unit tests have been completely revised. In the previous implementation, the runtime (the global variables such as `searx.settings`) was not initialized before each test, so the runtime environment with which a test ran was always determined by the tests that ran before it. This was also the reason why we sometimes had to observe non-deterministic errors in the tests in the past: - https://github.com/searxng/searxng/issues/2988 is one example for the Runtime issues, with non-deterministic behavior .. - https://github.com/searxng/searxng/pull/3650 - https://github.com/searxng/searxng/pull/3654 - https://github.com/searxng/searxng/pull/3642#issuecomment-2226884469 - https://github.com/searxng/searxng/pull/3746#issuecomment-2300965005 Why msgspec.Struct ================== We have already discussed typing based on e.g. `TypeDict` or `dataclass` in the past: - https://github.com/searxng/searxng/pull/1562/files - https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2 - https://github.com/searxng/searxng/pull/1412/files - https://github.com/searxng/searxng/pull/1356 In my opinion, TypeDict is unsuitable because the objects are still dictionaries and not instances of classes / the `dataclass` are classes but ... The `msgspec.Struct` combine the advantages of typing, runtime behaviour and also offer the option of (fast) serializing (incl. type check) the objects. Currently not possible but conceivable with `msgspec`: Outsourcing the engines into separate processes, what possibilities this opens up in the future is left to the imagination! Internally, we have already defined that it is desirable to decouple the development of the engines from the development of the SearXNG core / The serialization of the `Result` objects is a prerequisite for this. HINT: The threads listed above were the template for this PR, even though the implementation here is based on msgspec. They should also be an inspiration for the following PRs of typification, as the models and implementations can provide a good direction. Why just one commit? ==================== I tried to create several (thematically separated) commits, but gave up at some point ... there are too many things to tackle at once / The comprehensibility of the commits would not be improved by a thematic separation. On the contrary, we would have to make multiple changes at the same places and the goal of a change would be vaguely recognizable in the fog of the commits. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-15[refactor] unit tests (continued) - pluginsGrant Lanham
This commit includes some refactoring in unit tests. As we test more plugins, it seems unweildy to include every test class in the test_plugins.py file. This patch split apart all of the test plugins to their own respective files, including the new test_plugin_calculator.py file.
2024-10-03[refactor] unit tests to utilize paramaterized and break down monolithic testsGrant Lanham
- for tests which perform the same arrange/act/assert pattern but with different data, the data portion has been moved to the ``paramaterized.expand`` fields - for monolithic tests which performed multiple arrange/act/asserts, they have been broken up into different unit tests. - when possible, change generic assert statements to more concise asserts (i.e. ``assertIsNone``) This work ultimately is focused on creating smaller and more concise tests. While paramaterized may make adding new configurations for existing tests easier, that is just a beneficial side effect. The main benefit is that smaller tests are easier to reason about, meaning they are easier to debug when they start failing. This improves the developer experience in debugging what went wrong when refactoring the project. Total number of tests went from 192 -> 259; or, broke apart larger tests into 69 more concise ones.
2024-03-11[mod] pylint all files with one profile / drop PYLINT_SEARXNG_DISABLE_OPTIONMarkus Heiser
In the past, some files were tested with the standard profile, others with a profile in which most of the messages were switched off ... some files were not checked at all. - ``PYLINT_SEARXNG_DISABLE_OPTION`` has been abolished - the distinction ``# lint: pylint`` is no longer necessary - the pylint tasks have been reduced from three to two 1. ./searx/engines -> lint engines with additional builtins 2. ./searx ./searxng_extra ./tests -> lint all other python files Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-09[black] upgrade black 22.12.0 --> 24.2.0Markus Heiser
The issue discussed in [1] has been solved since [2] has been merged into black / now we can upgrade without touching 69 files as it was needed with black 23.1.0 [3]. [1] https://github.com/searxng/searxng/pull/2159#issuecomment-1425723977 [2] https://github.com/psf/black/pull/4060 [3] https://github.com/searxng/searxng/pull/2159/files Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-22[fix] HTMLParser: undocumented not implemented methodMarkus Heiser
In python versions <py3.10 there is an issue with an undocumented method HTMLParser.error() [1][2] that was deprecated in Python 3.4 and removed in Python 3.5. To be compatible to higher versions (>=py3.10) an error method is implemented which throws an AssertionError exception like the higher Python versions do [3]. [1] https://github.com/python/cpython/issues/76025 [2] https://bugs.python.org/issue31844 [3] https://github.com/python/cpython/pull/8562 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24[mod] replace utils.match_language by locales.match_localeMarkus Heiser
This patch replaces the *full of magic* ``utils.match_language`` function by a ``locales.match_locale``. The ``locales.match_locale`` function is based on the ``locales.build_engine_locales`` introduced in 9ae409a0 [1]. In the past SearXNG did only support a search by a language but not in a region. This has been changed a long time ago and regions have been added to SearXNG core but not to the engines. The ``utils.match_language`` was the function to handle the different aspects of language/regions in SearXNG core and the supported *languages* in the engine. The ``utils.match_language`` did it with some magic and works good for most use cases but fails in some edge case. To replace the concurrence of languages and regions in the SearXNG core the ``locales.build_engine_locales`` was introduced in 9ae409a0 [1]. With the last patches all engines has been migrated to a ``fetch_traits`` and a language/region concept that is based on ``locales.build_engine_locales``. To summarize: there is no longer a need for the ``locales.match_language``. [1] https://github.com/searxng/searxng/pull/1652 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-12-16Replace langdetect with fasttextArtikusHG
2022-01-29[mod] add documentation about searx.utilsAlexandre Flament
This module is a toolbox for the engines. Is should be documented. In addition, searx/utils.py is checked by pylint.
2021-12-27[format.python] initial formatting of the python codeMarkus Heiser
This patch was generated by black [1]:: make format.python [1] https://github.com/psf/black Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-12[fix] fix match_language issue to make zh-TW match to zh-Hant-TWMarc Abonce Seguin
pybabel separates locales with underscores but we use hyphens everywhere babel doesn't directly touch
2021-09-02[mod] move searx/testing.py to the tests directoryAlexandre Flament
move robot tests to tests.robot manage calls "python -m tests.robot"
2021-08-31[pylint] Pylint 2.10 - fix redundant-u-string-prefixMarkus Heiser
Pylint 2.10 added new default checks [1]: redundant-u-string-prefix: Emitted when the u prefix is added to a string [1] https://pylint.pycqa.org/en/latest/whatsnew/2.10.html [2] https://github.com/PyCQA/pylint/issues/4102 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-12-03[enh] record details exception per engineAlexandre Flament
add an new API /stats/errors
2020-10-02[mod] move extract_text, extract_url to searx.utilsAlexandre Flament
2020-09-22[mod] add searx/webutils.pyAlexandre Flament
contains utility functions and classes used only by webapp.py
2020-09-13[fix] searx.utils.HTMLTextExtractor: invalid HTML don't raise an ExceptionAlexandre Flament
Close #2188
2020-09-10Drop Python 2 (4/n): SearchQuery.query is a str instead of bytesDalf
2020-09-10Drop Python 2 (1/n): remove unicode string and url_utilsDalf
2019-08-02[fix] fix flickr_noapi decoding (#1655)Alexandre Flament
Characters that were not ASCII were incorrectly decoded. Add an helper function: searx.utils.ecma_unescape (Python implementation of unescape Javascript function).
2019-07-17[fix] secret_key can be bytes instead of a string (#1602)rachmadani haryono
Fix #1600 In settings.yml, the secret_key can be written as string or as base64 encoded data using !!binary notation.
2018-03-27refactor engine's search language handlingMarc Abonce Seguin
Add match_language function in utils to match any user given language code with a list of engine's supported languages. Also add language_aliases dict on each engine to translate standard language codes into the custom codes used by the engine.
2017-05-15[enh] py3 compatibilityAdam Tauber
2016-01-10[mod] remove buildout/makefile infrastructureAdam Tauber