## Testing

Tests are an integral part of the development process.  Most of our tests exercise one or a few related functions (something approximating unit tests), but there are also some end-to-end tests, and some that would probably be considered regression tests.

### Writing tests

All tests are found in the `tests` subdirectory.  That subdirectory includes several configuration files (including test fixtures in the standard pytest `conftest.py` file), and additional pytest fixtures in the `fixtures` subdirectory.  The actual tests are found in the other subdirectories, which (mostly) correspond to the subdirectories underneath the top level of the project where the actual code is found.

### Running tests

If you are in an environment that has all of the SeeChange prerequisites, you can run the tests by simply running the following command from the `tests` subdirectory of the project:

```
pytest -v
```

The tests need a lot of infrastructure to run, however.  If you really know what you're doing, you may be able to set up the full environment yourself.  Most users will find it easier to use the dockerized environment designed to run our tests.  See "Setting up a SeeChange instance" for more information.  This will also make your test environment (ideally) close to the environment in which automated tests will be run on github.

### Linting

We also use the `ruff` python linter, with a bunch of rules (enforcing a 120-character line width, demanding two blank lines before the start of a class or top-level function, looking for unused imports, looking for unused local variables, looking for deprecated numpy calls, looking for f-strings that don't need to be f-strings, and a bunch of other stuff like that).  The automated tests on github run this before the actual tests, and if any of the rules fail, the whole test run will fail.  By and large, fixing these is fast and slightly annoying.

You can run the linter in your own checkout by just running:

```
ruff check
```

in the top level directory of the checkout; it will use the configuration in the `ruff.toml` file there.  You can pip install ruff yourself, or you can run in our testing Docker Compose environment (where ruff is installed).

**Note**: Be careful doing automated fixes of things ruff finds!  Sometimes what ruff suggests for a fix is *wrong*.  For instance, it will sometimes suggest you remove a line that assigns to a variable that is never used later.  However, the rest of the code may depend on side effects of that assignment, such as filling in a lazy-loaded class variable.  Try to figure out if this is the case.  (One example: some of the tests call `ImageCleanup`, and the returned variable is never used again.  The tests depend on that variable going out of scope when the test ends.  In this case, if you leave the line in but don't capture the return value (i.e. you replace `val=ImageCleanup(...)` with just `ImageCleanup(...)`), it will break, because the return value goes out of scope immediately.)  If so, you can just assign to the variable `_` (a single underscore), or any other variable name starting with `_`, and ruff won't object that it's never used again later.
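The scoping issue can be seen in a minimal sketch.  `Cleanup` here is a hypothetical stand-in, not the real `ImageCleanup`; the sketch only assumes that clean-up happens when the object is destroyed, and the ordering shown relies on CPython's reference counting.

```python
events = []


class Cleanup:
    """Hypothetical stand-in for an object that cleans up when it is destroyed."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        events.append(f"cleaned {self.name}")


def without_capture():
    Cleanup("a")              # return value discarded: destroyed right away
    events.append("test body a")


def with_underscore():
    _ = Cleanup("b")          # kept alive until the function returns
    events.append("test body b")


without_capture()
with_underscore()
print(events)
# On CPython: ['cleaned a', 'test body a', 'test body b', 'cleaned b']
```

In the first function the clean-up runs before the body of the "test"; in the second it runs only after the body has finished, which is what the tests rely on.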

Occasionally we need to import modules that are never formally used later in a function.  For instance, the way the Instrument module works, you need to have imported the modules for any specific instruments before you call certain functions in Instrument; otherwise, those instruments will never be found later.  Ruff will object to the `import models.decam` (or whatever) line, because the module's not used, but we need it there.  In this case, you can add `# noqa: F401` at the end of the `import` line; this tells ruff that you know the line violates rule F401 (the "no unused imports" rule) and that you intend to violate it; ruff then lets you get away with it.

### Testing tips

#### Database deadlocks

(This will only work when testing on your local machine; you won't be able to use this procedure if you see a deadlock on github actions.)  If your tests seem to just freeze up, it's possible you've hit a database deadlock, where processes are stuck waiting on locks held by one another.  To see if this is the case, use `psql` to connect to your database server; if you're using either the devshell or the test docker environments, from a machine inside that environment run

```
psql -h postgres -U postgres seechange
```

and enter the database password (`fragile`).  Then run:

```
SELECT pid,usename,pg_blocking_pids(pid) as blocked_by,query as blocked_query
  FROM pg_stat_activity WHERE cardinality(pg_blocking_pids(pid))>0;
```

If you get any results, it means a process is blocked waiting on a database lock.  To be able to go on with your life, look at the process ID (or IDs) in the `blocked_by` column and, for each one, run

```
SELECT pg_terminate_backend(<number>);
```

That will allow things to continue, though of course tests will fail.

The next task is figuring out where the database deadlock came from and fixing it....

#### Files left over in database / archive / disk at end of tests

The tests are supposed to clean up after themselves, so at the end of a test run there should be nothing left in the database or on the archive.  (There are some exceptions of things allowed to linger.)  If anything is found at the end of the tests, errors are raised.  Unfortunately, these errors can hide the real errors you had in your test (which may also be the reason things were left behind!).  When debugging, you often want to turn off the check for leftovers, so you can see the real errors you're getting.  Edit `tests/fixtures/conftest.py` and set the variable `verify_archive_database_empty` to `False`.  (Remember to set it back to `True` before pushing your final commit for a PR, to re-enable the leftover file tests!)

#### Test caching and data folders

Some of our tests require large datasets (mostly images).  We include a few example images in the repo itself, but most of the required data is lazily downloaded from the appropriate servers (e.g., from NOIRLab).

To avoid downloading the same data over and over again, we cache the data in the `data/cache` folder.  To make sure the downloading process works as expected, users can choose to delete this folder.  Sometimes, also, tests may fail because things have changed, but there are older versions left behind in the cache; in this case, clearing out the cache directory will also solve the problem.  (One may also need to delete the `tests/temp_data` folder, if tests were interrupted.  Ideally, the tests don't depend on anything specific in there, but there may be things left behind.  Empirically, it also seems that sometimes when code changes, there is stuff left behind in `tests/temp_data` that causes tests to fail that pass after `tests/temp_data` is cleared, in addition to `data/cache` being cleared.)  In the tests, the path to this folder is given by the `cache_dir` fixture.

Note that the persistent data that comes with the repo is everything else in the `data` folder, which is pointed to by the `persistent_dir` fixture.

Finally, the working directory for local storage, which is referenced by the `FileOnDiskMixin.local_path` class variable, is defined in the test config YAML file, and can be accessed using the `data_dir` fixture.  This folder is systematically wiped when the tests are completed.
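The cache clean-up described in this section can be scripted.  The following is only a sketch: `clear_test_caches` is a hypothetical helper, not part of the repository, and it assumes the `data/cache` and `tests/temp_data` paths are relative to the top of the checkout.  It permanently deletes those two directories, so the demonstration runs on a throwaway directory.

```python
import shutil
import tempfile
from pathlib import Path


def clear_test_caches(checkout_root):
    """Delete data/cache and tests/temp_data under checkout_root, if present.

    Returns the relative paths that were actually removed.
    """
    removed = []
    for relative in ("data/cache", "tests/temp_data"):
        target = Path(checkout_root) / relative
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(relative)
    return removed


# Demonstrate on a temporary directory rather than a real checkout.
with tempfile.TemporaryDirectory() as root:
    cache = Path(root) / "data" / "cache"
    cache.mkdir(parents=True)
    (cache / "old_image.fits").write_text("stale")
    print(clear_test_caches(root))   # ['data/cache']
```

Everything else under `data` (the persistent data) is left alone.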

### Running tests on github actions

In the https://github.com/c3-time-domain/SeeChange repository, we have set up some github actions to automatically run tests upon pushes to main (which should never happen) and upon pull requests.  These tests run from the files in `tests/docker-compose.yaml`.  For them to run, some docker images must exist in the github container registry at `ghcr.io/c3-time-domain`.  As of this writing, the images needed are:

* `ghcr.io/c3-time-domain/upload-connector:[tag]`
* `ghcr.io/c3-time-domain/postgres:[tag]`
* `ghcr.io/c3-time-domain/kafka:[tag]`
* `ghcr.io/c3-time-domain/seechange:[tag]`
* `ghcr.io/c3-time-domain/conductor:[tag]`
* `ghcr.io/c3-time-domain/seechange-webap:[tag]`

where `[tag]` is the current tag expected by that compose file.  You can figure out what this should be by finding a line in `tests/docker-compose.yaml` like:
```
    image: ghcr.io/${GITHUB_REPOSITORY_OWNER:-c3-time-domain}/seechange:${IMGTAG:-tests20240628}
```
The thing in between `${IMGTAG:-` and `}` is the tag; in this example, `tests20240628`.
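If you would rather not read the tag off by eye, a short script can extract it.  This is a sketch only; `default_image_tags` is a hypothetical helper, and it assumes every image line uses the `${IMGTAG:-...}` form shown above.

```python
import re

# Matches the shell-style default in "${IMGTAG:-sometag}" and captures "sometag".
IMGTAG_DEFAULT = re.compile(r"\$\{IMGTAG:-([^}]+)\}")


def default_image_tags(compose_text):
    """Return the set of default tags found in the text of a compose file."""
    return set(IMGTAG_DEFAULT.findall(compose_text))


example = "    image: ghcr.io/${GITHUB_REPOSITORY_OWNER:-c3-time-domain}/seechange:${IMGTAG:-tests20240628}"
print(default_image_tags(example))   # {'tests20240628'}
```

Running it on the full text of `tests/docker-compose.yaml` should give a set with a single element; more than one element would mean the images are not all on the same tag.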

(If you look at `docker-compose.yaml`, you will also see that it depends on the `mailhog` image, but this is a standard image that can be pulled from `docker.io`, so you don't need to worry about it.)

Normally, all of these images should already be present in the `ghcr.io` archive, and you don't need to do anything.  However, if they aren't there, you need to build and push them.

To build and push the docker images, in the `tests` subdirectory run:
```
   IMGTAG=[tag] docker compose build
```
where `[tag]` is exactly what you see in the `docker-compose.yaml` file, followed by:
```
   docker push ghcr.io/c3-time-domain/upload-connector:[tag]
   docker push ghcr.io/c3-time-domain/postgres:[tag]
   docker push ghcr.io/c3-time-domain/seechange:[tag]
   docker push ghcr.io/c3-time-domain/conductor:[tag]
   docker push ghcr.io/c3-time-domain/seechange-webap:[tag]
   docker push ghcr.io/c3-time-domain/kafka:[tag]
```

For this push to work, you must have the requisite permissions on the `c3-time-domain` organization at github.
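To avoid typing the tag six times, the push commands can be generated.  This is a sketch under the assumption that the image list above is still current; `push_commands` is a hypothetical helper that only builds the command strings and does not run them.

```python
# Image names as listed in this section; update if the compose file changes.
IMAGES = [
    "upload-connector",
    "postgres",
    "seechange",
    "conductor",
    "seechange-webap",
    "kafka",
]


def push_commands(tag, owner="c3-time-domain"):
    """Build one `docker push` command line per image for the given tag."""
    return [f"docker push ghcr.io/{owner}/{image}:{tag}" for image in IMAGES]


for command in push_commands("tests20240628"):
    print(command)
```

The printed lines can be pasted into a shell once the build has finished.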

### When making changes to dockerfiles or pip requirements

This set of docker images depends on the following files:
* `docker/application/*`
* `docker/postgres/*`
* `webap/*`
* `requirements.txt`

If you change any of those files, you will need to build and push new docker images.  Before doing that, edit `tests/docker-compose.yaml` and bump the date part of the tag for _every_ image (search and replace is your friend), so that your changed images will only get used for your branch while you're still finalizing your pull request, and so that the updated images will get used by everybody else once your branch has been merged to main.
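The search and replace can also be done with a few lines of Python.  This is a sketch that assumes every image line uses the `${IMGTAG:-...}` form; `bump_image_tags` is a hypothetical helper, and `tests20240715` is just an example of a new tag.

```python
import re


def bump_image_tags(compose_text, new_tag):
    """Replace the default in every "${IMGTAG:-...}" with new_tag."""
    return re.sub(
        r"(\$\{IMGTAG:-)[^}]+(\})",
        lambda match: match.group(1) + new_tag + match.group(2),
        compose_text,
    )


before = (
    "image: ghcr.io/${GITHUB_REPOSITORY_OWNER:-c3-time-domain}/seechange:${IMGTAG:-tests20240628}\n"
    "image: ghcr.io/${GITHUB_REPOSITORY_OWNER:-c3-time-domain}/postgres:${IMGTAG:-tests20240628}\n"
)
print(bump_image_tags(before, "tests20240715"))
```

Only the `IMGTAG` defaults change; the `${GITHUB_REPOSITORY_OWNER:-c3-time-domain}` part is left untouched.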

### Updating the webap image

Even if you haven't changed the files mentioned in the previous section, if you make any edits to the webap, or if you make any changes to the code that change the behavior of the webap, you will need to update the webap's image.  This will often not be necessary.  However, if you see all the webap tests passing on your local machine but some of them failing on github actions, it may be because the webap image that github is using needs to be updated.  When this happens, do the steps described in "Running tests on github actions" above (bump the test version, `docker compose build`, and a bunch of `docker push`), or contact Rob to ask him to help you do that if you aren't able to do it all yourself.